Git is a very popular version control system for tracking changes in computer files and coordinating work on those files among multiple people (Wikipedia). It is widely used in data science projects to keep track of code and to support parallel development. Git can be used in very complicated ways; for data scientists, however, we can keep it simple. In this post, I am going to walk through the main use cases for a "Solo Master".
Note. There are many excellent resources that explain what Git is and what its basic concepts are. For those, I would refer you to "Getting Started -- Git Basic" on the official Git website.

Now we can start with a cool project! First, go to GitHub and create an empty project, then configure it properly on your local laptop.

Case 1. One working space, nothing goes wrong

This is the ideal and simplest situation. All you need to do is stage your files with "git add", commit the code, and then push to the remote master branch. Life is easy in this situation.

Case 2. One working space, mistake before "git add"

This always happens... You start playing with an idea, add some draft code to a file, and quickly figure out that the idea does not work. Now you want to get back to a clean slate. How do you do that? Fortunately, if you have not run "git add" on the changed file, this is very easy: "git checkout" restores the file to its last committed state. For more details, please refer to "Git checkout".

Case 3. One working space, mistake before "git commit"

You thought the idea was going to work, so you added a few files, made some changes, and ran "git add" a few times, only to find that the result is not right. Now you want to get rid of the mess and go back to the good old code. "git reset" takes the staging area (and, with "--hard", the working files) back to the last commit. For more details, please refer to "Git reset".

Case 4. One working space, mistake before "git push"

You went even further this time: not only did you run "git add", but the modification took a few hours and you also made a few commits. Ah, another huge mistake. What to do? As long as nothing has been pushed, "git reset" can move the branch back to the commit before the mistake. For more details, please refer to "Git reset".

Case 5. One working space, mistake after "git push"

You pushed the code to production, and other team members found a big mistake or bug in it. Now you need to revert the code to where it was. Because the commits are already shared, "git revert" undoes them by adding a new commit instead of rewriting history. For more details, please refer to "Git revert".

Case 6. Multiple working spaces

You have two working spaces: one on your company laptop and one on your company workstation.
You develop feature 2 in one working space and feature 3 in the other. Once the first working space has pushed, a push from the second one is rejected, because the remote branch now contains commits that the local branch does not have. The solution is to run "git pull" first. "git pull" = "git fetch" + "git merge", or "git fetch" + "git rebase". For the details, refer to "Git pull". After the pull and a push, the remote branch contains the commits from both working spaces.

As long as you develop each feature in its own working space, this process causes no problems. This is better practice than working on the same feature in different working spaces: if the same file is modified in both, the merge can produce many conflicts, and resolving them is a big job for a "solo master".
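The six cases above can be sketched as one throwaway script. This is only an illustration under a few assumptions: a local bare repository stands in for the GitHub project, and the file names, commit messages, and user identity are made up for the example. The exact flags (for instance "--hard" for "git reset" and "--rebase" for "git pull") are one reasonable choice, not the only one.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sandbox: a local bare repository stands in for the GitHub project (assumption).
sandbox="$(mktemp -d)"
remote="$sandbox/remote.git"
laptop="$sandbox/laptop"
workstation="$sandbox/workstation"

git init --quiet --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/master

git init --quiet "$laptop"
cd "$laptop"
git symbolic-ref HEAD refs/heads/master
git config user.name "Solo Master"          # hypothetical identity
git config user.email "solo@example.com"
git remote add origin "$remote"

# Case 1: nothing goes wrong -> add, commit, push.
echo "print('feature 1')" > analysis.py     # hypothetical file
git add analysis.py
git commit --quiet -m "Add feature 1"
git push --quiet -u origin master

# Case 2: mistake before "git add" -> restore the file from the last commit.
echo "draft idea" >> analysis.py
git checkout -- analysis.py

# Case 3: mistake after "git add", before "git commit" -> drop staged changes.
echo "draft idea" >> analysis.py
git add analysis.py
git reset --quiet --hard HEAD

# Case 4: mistake after "git commit", before "git push" -> go back to what was pushed.
echo "bad idea" >> analysis.py
git commit --quiet -am "Bad idea"
git reset --quiet --hard origin/master

# Case 5: mistake after "git push" -> undo with a new commit, then push it.
echo "bug" >> analysis.py
git commit --quiet -am "Introduce a bug"
git push --quiet origin master
git revert --no-edit HEAD > /dev/null
git push --quiet origin master

# Case 6: a second working space pushes feature 3 first.
git clone --quiet "$remote" "$workstation"
git -C "$workstation" config user.name "Solo Master"
git -C "$workstation" config user.email "solo@example.com"
echo "print('feature 3')" > "$workstation/feature3.py"
git -C "$workstation" add feature3.py
git -C "$workstation" commit --quiet -m "Add feature 3"
git -C "$workstation" push --quiet origin master

# Back on the laptop: commit feature 2, pull first, then push.
echo "print('feature 2')" > feature2.py
git add feature2.py
git commit --quiet -m "Add feature 2"
git pull --quiet --rebase origin master
git push --quiet origin master

git log --oneline
```

Be careful with "git reset --hard": it throws away uncommitted work for good, so it is worth running "git status" first to see what would be lost.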
Great! After these simple case studies, you are a real "solo master" of Git. You will not lose code (as long as it has been pushed to the cloud), and you need not worry about inconsistent code across working spaces (as long as "git pull" is used correctly). Enjoy using Git!
Author: Data Magician
October 2017