Git is one of the most popular tools among developers. In this blog, we’ll try to understand the concepts behind Git and why it has become so popular.
Let’s get started!
What is Version Control?
Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. Almost any type of file on a computer can be placed under version control.
Git is a version control system created by Linus Torvalds, the creator of the Linux kernel. Why did he create another tool? The Linux kernel is open source, so anyone, anywhere in the world, can contribute to it. You can download a copy of the kernel source, add your changes, and submit them. Your submission will not be accepted automatically; it has to offer something worthwhile, such as a new feature. The important thing is that anyone can become a contributor to the Linux kernel. Now, in a world with millions of programmers, if everyone is trying to work on one piece of software, how can that work be managed? It is not easy, trust me. Even with five people working on one project, keeping everyone’s changes in order is difficult. This is the problem Git was created to solve: it handles merging source code from different people and maintaining versions.
Distributed Version Control vs Centralized Version Control
Before distributed version control systems, developers worked with centralized version control systems such as Subversion. In a centralized system, you have one or more computers and one server, and all versions are stored on that server. The drawback is that your machine holds only a working copy, while the versions are maintained on the server. What if the server fails and you lose all the data on it? What if you are travelling, or in a remote place, and you don’t have an internet connection? How can you reach the repository? (That’s a new term: a repository is the place where your source code and its history are stored.) Without an internet connection you are stuck, and the server is a single point of failure: if it fails, you lose everything.
In a company, the data for every project from every computer is typically stored on that one server. What if everyone could have their own copy instead? In that case, each machine has a working copy plus something called a local repository.
So you have a local repository on your machine and a remote repository on the server. The advantage is that even if the server fails, you still have a local repository on your system, and you can mirror your local copy back to the server once it is restored. Because every developer’s machine has its own copy of the repository, this is called a distributed version control system. Even if you’re on a flight or in a remote place without an internet connection, you can keep working on your project and create new versions in your local repository. When you are connected to the internet again, you just push them to the server.
But the question arises: how do you set all of this up? First, install Git, which is free and open source. On macOS, Git is often already available. On Linux, you can install it with your package manager, as you would any other software. On Windows, there is Git Bash, which is part of Git for Windows. The next question is how to get the remote repository. Can you create your own? Yes, you can buy a server and run your own Git service on it. But you don’t have to, because several hosting services exist, such as GitHub, GitLab, and Bitbucket. There are also GUI tools for working with Git, such as SourceTree.
Before we start working with Git commands, it is necessary that you understand what they represent.
What is a Repository?
A repository (a.k.a. repo) is a collection of source code together with its version history.
There are four fundamental elements in the Git workflow:
Working Directory (also called the Working Tree), Staging Area, Local Repository and Remote Repository.
The image below gives some insight into how Git works.
Let’s understand the Git Workflow in more detail.
The Working Tree
The Working Tree is the area where you are currently working. It is where your files live. Files here that Git has not been told about are called “untracked”. Any change you make to a file shows up in the Working Tree. If you make changes here and never explicitly save them to Git, Git keeps no record of them, so they cannot be recovered from Git if they are lost. Git will notice that tracked files in your Working Tree have been modified, but until you tell it “Hey, pay attention to these files,” it won’t save anything that happens in them.
How can you see what is in your Working Tree? Run the command git status. This command shows two things: the changed and untracked files in your Working Tree and the files in your Staging Area. If you don’t have anything in your Staging Area, the output will look something like the image below.
E.g.: Let’s create two text files in a folder on any drive, add some content to them, and run the git status command.
First, you need to run the “git init” command, which creates a new Git repository. After that, you can run “git status” to check your Working Tree. You can see that the untracked files are shown in red. If you don’t run git init first, you will get an error that looks something like the image below.
So it’s necessary to initialize the project with git init before running other Git commands.
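The steps above can be tried out with a small sketch like the one below. It assumes a POSIX shell with Git installed, and it works in a throwaway directory created by mktemp (the directory and file contents are only illustrative).

```shell
# Throwaway directory so nothing else on your system is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

# Create two text files with some content.
echo "first file" > one.txt
echo "second file" > two.txt

# Initialize the repository; this creates the hidden .git directory.
git init -q

# --short prints one line per file; "??" marks untracked files.
git status --short
```

Both files are reported as untracked because nothing has been added to the Staging Area yet.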
The Staging Area (Index)
The Staging Area is where Git starts tracking and saving the changes made to files. The saved changes are recorded inside the .git directory. You tell Git which specific files you want to track, and Git takes a snapshot of them from your Working Tree into the Staging Area and says “Cool, I know about this file in its entirety.” However, if you make further changes after adding a file to the Staging Area, Git will not include those changes until you add the file again. You have to tell Git explicitly to notice the edits in your files.
How can you see what is in your Staging Area? Run the command “git status” like before. It will look something like the image below.
How do you add files to your Staging Area? Running the command
git add [filename] adds a specific file from your Working Tree to the Staging Area. If you want to add everything in the current directory, run the command
git add . where the dot (.) is a path meaning the current directory and everything under it.
git add . : Stages new, modified, and deleted files under the current directory (Git 2.x behavior; Git 1.x did not stage deletions).
git add -A : Stages all changes (new, modified, and deleted files) in the whole working tree.
git add -u : Stages modified and deleted files, but not new ones.
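To see the difference between these options, here is a minimal sketch, assuming Git 2.x and a throwaway repository. The file names and the “Demo User” identity are hypothetical and set only for this demo repository.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
# Hypothetical identity, stored only in this demo repository.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Commit two files so that Git tracks them.
echo "keep me" > tracked.txt
echo "delete me" > removed.txt
git add .
git commit -q -m "initial commit"

# Make one change of each kind: modify, delete, create.
echo "an edit" >> tracked.txt
rm removed.txt
echo "brand new" > new.txt

# -u stages the modification and the deletion, but not the new file.
git add -u
after_u="$(git status --short)"
echo "$after_u"

# -A stages everything, including the new file.
git add -A
after_A="$(git status --short)"
echo "$after_A"
```

In the short status format, a letter in the first column means the change is staged, while “??” means the file is still untracked.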
Suppose you modified a file that is not yet staged, but later you want to discard those modifications. How do you do that? Use the command “git checkout -- file-name” (Git 2.23 and later also offer “git restore file-name” for this).
To remove a file from the staging area, the command is “git rm --cached file-name”. The file itself stays in your Working Tree.
For example, let’s unstage the file “one.txt” from the staging area.
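Both commands can be sketched in a throwaway repository as follows. The file contents and the “Demo User” identity are assumptions made for illustration.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "original" > one.txt
git add one.txt
git commit -q -m "add one.txt"

# Make an edit we do not want, then discard it.
echo "unwanted edit" >> one.txt
git checkout -- one.txt
cat one.txt

# Stage a new file, then take it out of the staging area again.
echo "draft" > two.txt
git add two.txt
git rm -q --cached two.txt

# two.txt is untracked again, but still present on disk.
git status --short
```

After the checkout, one.txt contains only its committed content, and after git rm --cached, two.txt shows up as an untracked file.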
The Local Repository
The Local Repository is everything in your .git directory. Mainly, what you will find in your Local Repository are your checkpoints, or commits. It is the area that saves everything (so don’t delete it).
How do you add items from your Staging Area to your Local Repository? The git command is
git commit -m "commit message" which takes all the changes in the Staging Area, wraps them together, and puts them in your Local Repository. A commit is simply a checkpoint that records the staged changes, using the previous commit as the point of comparison. After committing, nothing is left staged.
Let’s add some content to our “one.txt” and “two.txt” files and commit them.
Now let’s say you make some changes to the “one.txt” file and commit them.
If you want to view a commit and the changes it made, use the “git show” command. To view the contents of a file as of a given commit, use “git show <commit>:<file-name>”.
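As a rough sketch of the two commits described above (file contents, messages, and the “Demo User” identity are illustrative assumptions):

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

# First commit: both files.
echo "hello" > one.txt
echo "world" > two.txt
git add one.txt two.txt
git commit -q -m "add one.txt and two.txt"

# Second commit: a change to one.txt only.
echo "more text" >> one.txt
git add one.txt
git commit -q -m "update one.txt"

# Nothing is left to stage or commit.
git status --short

# Show the latest commit with a summary of the files it touched.
git --no-pager show --stat HEAD

# Show one.txt as it was one commit earlier.
git --no-pager show HEAD~1:one.txt
```

HEAD~1 means “the commit before the current one”, so the last command prints the file as it looked after the first commit.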
How does the git commit command work internally?
The above image shows how the git commit command works. The two blocks represent commits, each with its unique SHA commit key. In Git, the default branch has traditionally been named master. Suppose you are on the master branch and you make a commit: Git internally creates a block (node) for that commit. Git has a pointer called HEAD, a reference to the current commit on the current branch (master, in our case). When you commit something, HEAD points to that commit; in our case its hash key is 33132f. If you make another commit, HEAD moves and points to the commit with hash key 473cc7. The parent of 473cc7 is 33132f, because each commit holds a reference to the previous commit. So our last commit is 473cc7, its parent is 33132f, and HEAD points to 473cc7.
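The parent relationship can be inspected from the command line. The sketch below is only an illustration: it uses empty commits (--allow-empty) to keep things short, so the hashes it prints will differ from the ones in the image.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

git commit -q --allow-empty -m "first commit"
first="$(git rev-parse HEAD)"

git commit -q --allow-empty -m "second commit"
second="$(git rev-parse HEAD)"

# HEAD is attached to the master branch...
git symbolic-ref --short HEAD

# ...and HEAD^ (the parent of the current commit) is the first commit.
echo "first:  $first"
echo "second: $second"
echo "parent: $(git rev-parse HEAD^)"
```

The hash printed on the “parent” line matches the one on the “first” line, which is the link between the two blocks in the diagram.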
Git Log command
Git logs allow you to review and read the history of everything that has happened in a repository. The history is displayed using
git log, a simple tool with a ton of options for displaying commit history. A Git log is a running record of commits. A full log entry has the following pieces:
- A commit hash (a 40-character SHA-1 checksum of the commit contents). Because it is generated from the commit contents, it is effectively unique.
- Commit Author metadata: The name and email address of the author of the commit.
- Commit Date metadata: A date timestamp for the time of the commit.
- Commit title/message: The overview of the commit as written in the commit message.
This is a snippet of the log, showing one commit. We have a commit SHA1 hash, the author, the date, and the commit message, explaining what happened in the commit. This layout is the default look of the log.
Git has something called Commit Limiting to make it easier to narrow down hundreds or thousands of commits to the ones you want to review.
Git uses SHA-1 hashes to identify content, and the probability of a hash collision is near zero. So have fun!
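A few common ways to limit and format the log can be sketched like this. The commit messages and the “Demo User” author are made up for the example, and empty commits are used to keep it short.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

git commit -q --allow-empty -m "add login page"
git commit -q --allow-empty -m "fix typo in readme"
git commit -q --allow-empty -m "add logout button"

# Only the two most recent commits, one line each.
git --no-pager log --oneline -n 2

# Only commits whose message starts with "add".
git --no-pager log --oneline --grep="^add"

# Only commits by a given author.
git --no-pager log --oneline --author="Demo User"

# The pieces of a log entry: hash, author, date, message.
git --no-pager log -1 --format="%H%n%an <%ae>%n%ad%n%s"
```

The -n, --grep, and --author options are examples of commit limiting; they can be combined when a history gets long.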
Tell Git who you are using the “git config” command.
In Git, you have to set your username and email address, since every Git commit uses this information to identify you as the author.
- To set your user name, use the git config --global user.name "YOUR_USERNAME" command.
- To set your email, use the git config --global user.email "email@example.com" command.
- To check the values you just provided, use the command: git config --global --list
Use of the --global flag in the git config command
The global level configuration is user-specific, meaning it is applied to an operating system user. Global configuration values are stored in a file located in the user’s home directory: ~/.gitconfig on UNIX systems and C:\Users\<username>\.gitconfig on Windows.
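If you want to try these commands without touching your real settings, one option is a sandbox like the sketch below. The first four lines are an assumption made only for this demo: they point HOME (and GIT_CONFIG_GLOBAL, which newer Git versions honor) at a throwaway directory. On your own machine you would run just the three git config commands, with your own name and email.

```shell
# Sandbox only: redirect the "global" config file to a throwaway directory.
sandbox_home="$(mktemp -d)"
export HOME="$sandbox_home"
export GIT_CONFIG_GLOBAL="$sandbox_home/.gitconfig"
: > "$HOME/.gitconfig"

# Placeholder values; replace them with your own.
git config --global user.name "Demo User"
git config --global user.email "demo@example.com"

# List the global settings, then look at the file they were written to.
git config --global --list
cat "$HOME/.gitconfig"
```

The file is a plain text file with a [user] section, which is why it is also fine to edit it by hand.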
Comparing Files with Git Diff
The main objective of version control is to enable you to work with different versions of files. Git provides the diff command to let you compare different versions of your files.
E.g.: Some time ago, you modified the “one.txt” file. Now you want to see what you changed, so you use the “git diff <filename>” command, which compares the file in your Working Tree with its staged version.
If you want to compare the files in the staging area with the last commit in your local repository, use the “--staged” flag with the git diff command.
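The two forms of git diff can be compared side by side in a small sketch (file contents and the “Demo User” identity are illustrative assumptions):

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "line one" > one.txt
git add one.txt
git commit -q -m "add one.txt"

# Unstaged edit: plain git diff shows it.
echo "line two" >> one.txt
git --no-pager diff one.txt

# Once the edit is staged, plain git diff has nothing left to show...
git add one.txt
git --no-pager diff one.txt

# ...and the change appears under --staged instead.
git --no-pager diff --staged one.txt
```

Lines starting with “+” in the diff output were added; lines starting with “-” were removed.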
Branching is a feature available in most modern version control systems. Branching in other version control systems can be an expensive operation in both time and disk space. In Git, branches are a part of your everyday development process. Git branches are effectively a pointer to a snapshot of your changes. When you want to add a new feature or fix a bug, no matter how big or how small, you spawn a new branch to encapsulate your changes. This makes it harder for unstable code to get merged into the main codebase, and it gives you the chance to clean up your feature’s history before merging it into the main branch.
The diagram above visualizes a repository with two isolated lines of development, one for a little feature and one for a longer-running feature. By developing them in branches, it’s not only possible to work on both of them in parallel, but it also keeps the master branch free from questionable code. A branch represents an independent line of development. Branches serve as an abstraction for the edit/stage/commit process.
The git branch command lets you create, list, rename and delete branches. It doesn’t let you switch between branches or put a forked history back together again. For this reason,
git branch is tightly integrated with the
git checkout and
git merge commands.
E.g.: Suppose you are working on a feature for your website, so you create some files in your working directory. But you worry about what happens if the new code has errors: they would disrupt the website, because they would land on the master branch. So you decide to create another branch named “feature-dev” and implement the feature there. If the newly added feature produces an error, that is no problem, because the previous, working code is still on the master branch and the new code is isolated on a different branch.
git branch <branch-name>
In the above image, “master” appears in green with an asterisk (*), which means we are currently on the master branch. We then use the command
ls to view the files on the master branch. As you can see, we also have a branch called “feature-dev”. So, when run without arguments, the “git branch” command lists the branches in the repository.
As you see, we created another branch called feature-dev, which has all the same files as the master branch. Now we will create a file named feature.txt in this branch.
git checkout <branch-name> is used to switch to a branch.
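Here is a minimal sketch of the branch workflow described above, run in a throwaway repository. The file index.txt and the “Demo User” identity are assumptions for illustration; the branch and feature file names follow the article.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "home page" > index.txt
git add index.txt
git commit -q -m "initial commit"

# Create the branch, then list branches; the asterisk marks the current one.
git branch feature-dev
git branch

# Switch to the new branch and commit a file there.
git checkout -q feature-dev
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "add feature.txt"

# Back on master, feature.txt is not in the working tree.
git checkout -q master
ls
```

The file reappears as soon as you check out feature-dev again, because it lives in that branch’s commits.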
After that, you push your login feature to the remote repository. Let’s have a look at our remote repository.
You can see that our remote repository now has the new branch, with the “feature.txt” file. After adding the feature, you find that it works perfectly fine, so let’s merge this branch into the master branch, our default branch. To merge the branch, we use the
git merge <branch-name> command.
To merge, you first have to check out the master branch and then merge the feature branch. When the merge creates a merge commit, the merge command opens an editor so that you can write a commit message. If you don’t want to change the default message and the editor is Vim, just type :wq and hit Enter to save and exit.
Let’s view our commit history.
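The merge can be sketched as follows. Two choices here are assumptions made for the demo rather than part of the steps above: --no-ff forces a merge commit even though master has not moved, and -m supplies the message so that no editor opens.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "home page" > index.txt
git add index.txt
git commit -q -m "initial commit"

# Build the feature on its own branch.
git checkout -q -b feature-dev
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "add feature.txt"

# Switch back to master and merge the feature branch into it.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature-dev'" feature-dev

# The graph shows the merge commit joining the two lines of history.
git --no-pager log --oneline --graph
```

Without --no-ff, Git would simply fast-forward master to the feature commit in this situation, and the history would stay linear.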
Now that the “feature-dev” branch has been merged successfully, there is no further use for it. So let’s delete that branch.
Deleting a local/remote branch
To delete the branch from your system, the command is
git branch -d <branch-name>
As you see in the above image, the login-feature branch has been deleted from our system. But in the remote repository the login-feature branch is still there, as you can see in the image below.
So, to delete this branch from the remote repo, the command is
git push origin --delete <branch-name>
We successfully deleted the branch from the remote repository.
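Both deletions can be tried without a GitHub account. In the sketch below, a local bare repository stands in for the remote; that stand-in, the paths, and the “Demo User” identity are assumptions for illustration.

```shell
work="$(mktemp -d)"

# A bare repository plays the role of the remote server.
git init -q --bare "$work/remote.git"

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

git commit -q --allow-empty -m "initial commit"
git push -q origin master

# Create a branch and publish it.
git branch login-feature
git push -q origin login-feature

# Delete the local branch; the remote one is still listed by -r.
git branch -d login-feature
git branch -r

# Delete the branch on the remote as well.
git push -q origin --delete login-feature
git branch -r
```

The second git branch -r no longer lists origin/login-feature, which confirms the branch is gone from the remote too.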
git push command
The git push command is used to upload local repository content to a remote repository. Pushing is how you transfer commits from your local repository to a remote repository. It’s the counterpart to
git fetch, but whereas fetching imports commits into your local repository, pushing exports commits to remote branches. Remotes are configured using the
git remote command. Pushing has the potential to overwrite changes, so take care when pushing.
git push <remote> <branch>
Push the specified branch to <remote>, along with all the necessary commits and internal objects. This creates the branch in the destination repository if it does not exist there yet. To prevent you from overwriting commits, Git won’t let you push when it would result in a non-fast-forward merge in the destination repository.
git push <remote> --force
Same as the above command, but forces the push even if it results in a non-fast-forward merge. Do not use the --force flag unless you’re absolutely sure you know what you’re doing.
git push <remote> --all
Push all of your local branches to the specified remote.
The above diagram shows what happens when your local master has progressed past the central repository’s master and you publish changes by running
git push origin master. Notice how git push is essentially the same as running git merge master from inside the remote repository.
After adding a feature, making changes, or doing other tasks in the local repository, you can use the push command to upload these changes to the remote repository (e.g. GitHub) so other team members can see them and update their copies accordingly.
A simple example of using the push command can be:
git push origin master
Here, origin is the name of the remote repository (the default name). You may replace it with whatever name was assigned to the remote when it was added.
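As a sketch of git push origin master and git push --all, the example below again uses a local bare repository in place of GitHub (an assumption for the demo, along with the paths and the “Demo User” identity).

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # stand-in for the remote server

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo "hello" > one.txt
git add one.txt
git commit -q -m "add one.txt"

# Publish master, then list the branches the remote now has.
git push -q origin master
git ls-remote --heads origin

# Create a second branch and publish every local branch at once.
git branch feature-dev
git push -q origin --all
git ls-remote --heads origin
```

git ls-remote asks the remote directly, so it is a handy way to confirm what a push actually published.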
git push -u is the short form of git push --set-upstream. The -u flag records the pushed branch on the remote (for example, master on origin) as the upstream of your local branch in your Git config. When the push succeeds, or the branch is already up to date, the upstream reference is added.
When you push a local branch with the
git push -u option, that local branch is linked with the remote branch automatically. The advantage is that you can then use git pull without any arguments.
The option is followed by the remote name and the branch name. I used master, which you may replace with your branch name.
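What -u records can be checked afterwards. This is a sketch under the same assumptions as before: a local bare repository stands in for the remote, and the identity is hypothetical.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # stand-in for the remote server

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

git commit -q --allow-empty -m "initial commit"

# Push and remember origin/master as the upstream of the local master.
git push -q -u origin master

# The upstream is stored in the repository's config...
git config branch.master.remote
git config branch.master.merge

# ...and can be referred to as @{u}.
git rev-parse --abbrev-ref "@{u}"

# With an upstream set, pull needs no arguments.
git pull -q
```

The two config values together tell Git which remote and which branch to use when you run git pull or git push with no arguments.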
As expected, we got an error because we don’t have a remote repository yet. So first, let’s create a remote repository named git-tutorials on the GitHub website. It is empty at this stage.
With this, we have created our remote repository. As you can see in the image above, GitHub also shows some basic Git commands to get started, along with the repository URL for both the HTTPS and SSH protocols.
To add your project to GitHub, use the git remote add command in the terminal, from the directory where your repository is stored.
git remote add <remote-name> <remote-URL>
The git remote add command takes two arguments:
- A remote name, for example, “origin”
- A remote URL, which you will see in the image below.
After that, you can run the git push command to push your code to the GitHub repository.
Our code is successfully pushed to the GitHub repository.
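The same steps can be rehearsed offline. In this sketch, a local bare repository named git-tutorials.git stands in for the GitHub repository, so the “URL” is just a file path; with GitHub you would use the HTTPS or SSH URL shown on the repository page instead.

```shell
work="$(mktemp -d)"

# Empty stand-in for the remote repository created on GitHub.
git init -q --bare "$work/git-tutorials.git"

git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "hello" > one.txt
git add one.txt
git commit -q -m "add one.txt"

# Register the remote under the name "origin" and check the result.
git remote add origin "$work/git-tutorials.git"
git remote -v

# Now the push has somewhere to go.
git push -q origin master
```

git remote -v lists each remote name with the URL used for fetching and for pushing.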
Git Pull Command
The git pull command is used to fetch and download content from a remote repository and immediately update the local repository to match that content. Merging remote upstream changes into your local repository is a common task in Git-based collaboration workflows. The
git pull command is actually a combination of two other commands, git fetch followed by git merge. In the first stage, git pull executes a git fetch scoped to the local branch that HEAD (the reference to the current commit) points at. Once the content is downloaded, git pull enters a merge workflow. If the local and remote histories have diverged, a new merge commit is created and HEAD is updated to point at it; otherwise, the local branch is simply fast-forwarded.
E.g.: Suppose someone on your team changed a file or created a file in the remote repository. Let’s say they created a file named
three.txt, but you don’t know what they changed in the remote repository. Meanwhile, you create another file named
read.txt and commit it. When you try to push it to the remote repository, Git will not allow the push, because the remote repository contains work that you do not have locally. In this case, the git pull command is used to fetch and download that content from the remote repo into the local repo.
Now you create another file in the local repository (on your system).
After creating the “read.txt” file, you add it to the staging area using the
git add "read.txt" command. After that, you commit the file using the
git commit -m "added read.txt file" command, and then you push it to the remote repository using the command
git push origin master.
In the above image, you see that we got an error while running the
git push command. To resolve this error, we need to run the
git pull origin master command.
In the above image, you see that the file is successfully pushed to the master branch, because we first used the
git pull origin master command to bring our local repository up to date with the remote repository.
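The whole scenario (rejected push, pull, successful push) can be reproduced locally. The sketch below makes several assumptions: a bare repository stands in for the remote, a second clone plays the teammate, and the identities are made up. It also passes --no-rebase and --no-edit to git pull, because recent Git versions ask you to choose how to reconcile diverged branches, and the merge behavior is the one described above.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # stand-in for the remote server

# Your repository.
git init -q "$work/you"
cd "$work/you"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"
echo "one" > one.txt
git add one.txt
git commit -q -m "add one.txt"
git push -q origin master

# Your teammate clones the remote, adds three.txt, and pushes.
git clone -q -b master "$work/remote.git" "$work/teammate"
cd "$work/teammate"
git config user.name "Teammate"           # hypothetical identity
git config user.email "teammate@example.com"
echo "three" > three.txt
git add three.txt
git commit -q -m "add three.txt"
git push -q origin master

# Back in your repository: commit read.txt and try to push.
cd "$work/you"
echo "read me" > read.txt
git add read.txt
git commit -q -m "added read.txt file"
if git push -q origin master 2>/dev/null; then
  echo "push unexpectedly succeeded"
else
  echo "push rejected: the remote has work you do not have locally"
fi

# Pull (fetch + merge), then push again.
git pull -q --no-rebase --no-edit origin master
git push -q origin master
ls
```

After the pull, your working tree contains three.txt as well as read.txt, and the second push goes through.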
Note: About Merge Conflicts
Merge conflicts happen when you merge branches that have competing commits, and Git needs your help to decide which changes to incorporate in the final merge.
Git can often resolve differences between branches and merge them automatically. Usually, the changes are on different lines, or even in different files, which makes the merge simple for computers to understand. However, sometimes there are competing changes that Git can’t resolve without your help. Often, merge conflicts happen when people make different changes to the same file, or when one person edits a file and another person deletes the same file. If you have a merge conflict on the command line, you cannot push your local changes to GitHub until you resolve the merge conflict locally on your computer.
To resolve merge conflicts, you can run a merge conflict resolution tool. The command is:
git mergetool
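A conflict is easy to provoke on purpose. The sketch below changes the same line on two branches and then resolves the conflict by hand rather than with git mergetool (a manual resolution is an assumption made so the example needs no extra tools). The file name, its contents, and the identity are illustrative.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "color: red" > style.txt
git add style.txt
git commit -q -m "add style.txt"

# Change the same line differently on two branches.
git checkout -q -b feature-dev
echo "color: blue" > style.txt
git commit -q -am "make it blue"
git checkout -q master
echo "color: green" > style.txt
git commit -q -am "make it green"

# The merge stops because Git cannot pick a side on its own.
if git merge feature-dev >/dev/null 2>&1; then
  echo "merge unexpectedly succeeded"
else
  echo "merge stopped: conflict in style.txt"
fi
conflict_status="$(git status --short)"
echo "$conflict_status"

# Resolve by writing the final content, then stage and commit.
echo "color: teal" > style.txt
git add style.txt
git commit -q -m "merge feature-dev, settle on teal"
```

While the conflict is open, git status marks the file as “UU” (modified on both sides), and the file itself contains conflict markers showing both versions.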
When we use the “git fetch” command, we fetch the commits of a remote repository into our local repository. The fetched commits are saved as remote-tracking branches, separate from the local branches. This helps in reviewing the commits before integrating them into the local working branches.
Git Pull vs Git Fetch
Now, you might confuse the git pull and git fetch commands, because both fetch changes from the remote repository into the local repository. But hold on, git fetch works differently.
The git pull command fetches the commits from the remote repository and merges them into the local branch automatically. The git fetch command, in contrast, doesn’t change your local branches; the fetched changes sit in origin/master (a remote-tracking branch, that is, a local copy of the branch named “master” on the remote named “origin”).
Local branch : master (on local system)
Remote Tracking branch : origin/master (on local system)
Remote branch : master (origin master on GitHub or remote server)
git fetch : updates remote-tracking branch from remote branch
e.g. : updates (origin/master) from (origin master)
git merge : merges remote-tracking branch into local branch
e.g. : merge (origin/master) into (master)
If you want to view the remote-tracking branches, just run the command “git branch -r”.
E.g.: Let’s say you delete the “read.txt” file in the remote repository.
If you come back to your local repository and run the command “git log”, you won’t find the commit you made in the remote repository in your commit log.
To get the commits from the remote, you have to use the “git fetch origin” command. When you run “git fetch”, commits are downloaded but they are not integrated into your local branch. Your local commit history doesn’t change at all.
So the changes have not yet been merged from the remote branch into the local branch. To do this merge, we use the “git merge branch-name” command; in our case, the branch is “origin/master”. This merges the “origin/master” branch into our local branch. If the merge creates a commit, you have to give it a commit message, and then you can see that the commit is done.
When you run the “git log” command again, you will see the new history, including a merge commit if one was created.
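The fetch-then-merge sequence can be sketched with a local stand-in for the remote. The assumptions are the same as in the other examples: a bare repository plays the remote, a second clone makes the remote-side change, and the identities are hypothetical. Because your local branch has no new commits of its own in this sketch, the merge is a fast-forward and no merge commit is created.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # stand-in for the remote server

# Your repository, with one.txt and read.txt pushed.
git init -q "$work/you"
cd "$work/you"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"
echo "one" > one.txt
echo "read me" > read.txt
git add .
git commit -q -m "add one.txt and read.txt"
git push -q origin master

# Someone deletes read.txt on the remote side.
git clone -q -b master "$work/remote.git" "$work/other"
cd "$work/other"
git config user.name "Teammate"           # hypothetical identity
git config user.email "teammate@example.com"
git rm -q read.txt
git commit -q -m "delete read.txt"
git push -q origin master

# Fetch: origin/master moves, your local master does not.
cd "$work/you"
before="$(git rev-parse master)"
git fetch -q origin
after_fetch="$(git rev-parse master)"
git branch -r

# Review what was fetched before integrating it.
git --no-pager log --oneline master..origin/master

# Merge the remote-tracking branch into the local branch.
git merge -q origin/master
ls
```

The range master..origin/master lists exactly the commits that the fetch brought in and that the merge is about to integrate.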
git clone is a Git command-line utility used to target an existing repository and create a clone, or copy, of it. The command for cloning a repository is:
git clone <repository-link>
E.g.: Let’s say you want to clone the TensorFlow repository.
The above image shows the TensorFlow repository. We want to clone it, which means downloading it to our system. We use the repository’s link to clone it.
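Cloning a large project like TensorFlow takes a while, so here is a small offline sketch of what git clone does. It clones a tiny local repository instead of a GitHub link; the paths, file, and identity are assumptions for illustration.

```shell
work="$(mktemp -d)"

# A small repository to clone from (stand-in for a repository link).
git init -q "$work/source"
cd "$work/source"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "demo project" > README.txt
git add README.txt
git commit -q -m "add README"

# Clone it into a new directory.
git clone -q "$work/source" "$work/copy"
cd "$work/copy"

# The clone has the files, the full history, and a remote named origin.
ls
git --no-pager log --oneline
git remote -v
```

The clone automatically gets a remote called origin that points back at the place it was cloned from, which is why git push origin master works in a cloned repository without any git remote add.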
Hope this helps you in learning Git.
Thanks & keep Learning.. 😊