What is LFS really? Private server LFS?

Below are a couple of questions regarding Ben’s Git Smart course, and maybe a bit of feedback.

What is, in reality, the raison d’être of LFS? Is it because GitHub throws a fit about large files, since they can’t offer that much free space to everyone? Was it created/incorporated into Git because of that, or was Git going to include it anyway as an option for handling large files?

How do I set up a way to track LFS, but on my private home server instead? Not sure if this falls outside the course’s scope. But I recently set up a server with Ubuntu on a very old laptop, and apart from storing some large system and disk backups it wasn’t getting much other use. Then your course inspired me to think about using my own server rather than (or in addition to) GitHub. After some trial and error I managed to get it working: I can push git-tracked repos to my server. (I think I can pull too, but haven’t fully tested that yet.) But what about LFS? And actually… if I have “tons” of space on my private server, do I really need to worry about LFS? Does it become pointless? (Hence my first question above.)
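For reference, the essence of what I got working was roughly this (simulated locally below so it runs anywhere; on the real server the bare-repo path becomes an SSH URL like `user@homeserver:/srv/git/myproject.git`, and all the names here are just examples):

```shell
set -e
# Stand-in for the server's storage; on a real server this would be
# a directory like /srv/git, reached over SSH.
srv=$(mktemp -d)
git init --bare -q "$srv/myproject.git"   # bare repo: no working tree

# Stand-in for the local machine's working repo.
work=$(mktemp -d)
cd "$work"
git init -q
git -c user.name=me -c user.email=me@example.com \
    commit -q --allow-empty -m "first commit"

# Over SSH the remote URL would be user@homeserver:/srv/git/myproject.git
git remote add home "$srv/myproject.git"
git push -q home HEAD                     # pulling works the same way
```

The only server-side requirement is that Git is installed and the bare repository exists; everything else happens over plain SSH.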

My humble feedback: I think this course could be the perfect opportunity to teach people a basic way to push to private repos outside the standard choice of GitHub; it’s probably just one short video. I had to google around a bit and read Ubuntu’s documentation in my case, but I think it would complement this course rather nicely. You could always give people the option to skip it if a private home server isn’t an option for them. Anyway, I obviously understand the appeal of teaching the basic workflow with the free, hassle-free online tools out there, which also allow for public collaboration. But collaboration over a private network is also a thing, so I thought I’d add that as an idea.

Hi Evangelos,

Thank you for your questions.

  1. LFS does not have anything to do with GitHub (the website). It is an extension that makes Git work with large files more efficiently. See here.

  2. Configuring your own Git server is beyond the scope of this course. You could take a look at GitLab CE and read the docs.

See also:

  • Atlassian Tutorials: Git LFS
  • Youtube: GitLFS - How to handle large files in Git - Lars Schneider - FOSSASIA Summit 2017

Thanks Nina. But doesn’t GitHub still store the files we track with LFS on their storage?
I fully understand that it wasn’t part of the scope, but in a separate suggestion to Ben in a different thread, I sort of argued that maybe it should be.

Anyway, great little course, I wish there was more! (again another suggestion to Ben). All the guys here are great instructors and should do more! I will certainly follow your links and further my knowledge, thanks.

At the end of Ben’s Git course, he mentions some “links” to the other courses that we should follow if possible; it helps with revenues etc., which I totally respect. So where are those links? Am I blind? I want to check out the Blender course (I do have a coupon for it from completing the Unreal C++ course), but the links I already have take me to the Udemy site.

I could totally take the Blender course on your website if that’s what you prefer, but I can’t seem to find a link, and when one clicks “Courses” in the menu bar up top, one still gets directed to Udemy anyway.

Maybe I’m missing a trick here.

They do. I think I misunderstood your text. What I meant was that GitHub enabled LFS support on their servers, but the reason is probably not that they don’t want to offer that much free space to everyone. The maximum size of a repo is 100 GB, and you are allowed to upload files no larger than 100 MB. With Git LFS, you can store files up to 2 GB on their servers. On Bitbucket, the limits are different. According to this bug report, it is possible to upload larger files with Git LFS, so I assume the maximum file size is set on the server side.

Configuring servers is definitely way beyond the scope of the Git course. Most people use services like GitHub, Bitbucket or GitLab.

Regarding all the other suggestions, Ben reads them here in the forum on a regular basis. I don’t know if he will add more videos one day. At the moment, he is busy creating content for his upcoming maths course.

In the Resources of the last lecture. See here:

Ah ok, those are still in the Udemy interface from the looks of it. I viewed the course on your site directly.

On which site? Do you mean https://courses.gamedev.tv?

LFS exists to reduce repository bloat caused by Git’s need to re-store entire copies of “binary” files regardless of the size of the change made to them.

E.g. a 10 MB text file with a 1 KB change made 100 times will consume somewhere around 11 MB of space in the repository (guesstimate). A 10 MB binary file with a 1 KB change made 100 times will consume over 1 GB of space in the repository, because each time it has to store the whole 10 MB again (plus any overheads).
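In rough numbers (a back-of-envelope sketch, assuming the text-file history packs down to roughly base plus deltas once Git compresses it):

```shell
# 10 MB text file, 100 edits of ~1 KB each: packed history is
# roughly the base file plus the accumulated deltas.
echo "text:   ~$(( 10240 + 100 * 1 )) KB total"
# 10 MB binary file, 100 edits: a fresh 10 MB snapshot per commit.
echo "binary: ~$(( 100 * 10 )) MB total (before compression)"
```

Same 100 edits, roughly a hundredfold difference in repository growth, which is the whole motivation for LFS.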

So the solution to this, especially given that Git commits are “immutable”, is to store the physical file elsewhere (since you have to store the entire thing anyway) and only retain a reference pointer, a “shortcut file”, to it in Git.

That keeps the history of changes without directly bloating the size of the repo. LFS implementations can also do things like only offering the copies of the binary file that are actually relevant to your branch’s position in the commit history. Whereas before you would have had to pull the entire 1 GB+ down, it may now skip 990 MB of those changed files on the basis that you’re probably only going to need the ‘current’ one being worked on for now.
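In day-to-day use this pointer mechanism is mostly invisible; the setup is just something like the following (assuming git-lfs is installed, and with file patterns as examples only):

```shell
# Tell LFS which patterns to manage; this writes filter rules
# into .gitattributes, which you commit like any other file.
git lfs track "*.psd" "*.wav"
git add .gitattributes

# A tracked file is committed as a tiny pointer; its real bytes go
# to the LFS store. The pointer held in the repo looks like:
#   version https://git-lfs.github.com/spec/v1
#   oid sha256:4d7a21...
#   size 10485760
git add big-texture.psd
git commit -m "Track textures with LFS"
```

From then on, `git add`/`git commit`/`git push` behave as normal; the pointer swap happens behind the scenes via Git’s filter mechanism.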

Your Git repository can gain speed benefits as a result: it is easier to manage and change without big bulky objects getting in the way, and checking out branches and commits is quicker and takes less local space. The tradeoff is some increased backup complexity (getting your LFS and Git remote storage out of sync is a nightmare, and quite possible, as they are two separate services). GitHub and others may try to hide some of this complexity by appearing to merge the two offerings, but obviously they don’t want this abused either, hence the size limits.

Lastly, once you get familiar with using LFS: just because a file is big doesn’t necessarily mean it should be in your Git and/or LFS repository at all.

Is it a file that can be regenerated from smaller sources? If so, it’s the sources that should be committed, not the end result (probably). Is it something that can or should be packaged separately in its own right and just imported?

If so, learn how to do that with package managers or as an importable asset bundle. Your project then just needs a reference to whichever version of the big asset it requires and is compatible with; once that asset gets updated, you can plan and figure out when to upgrade by pulling in the update and adjusting the references.

At least in theory. That approach works well with libraries in vanilla .NET or Java development, for example (where I mainly get away with not having to use LFS by this method), but may be harder to achieve in game engines, which can have a lot of metadata dependencies and bindings to the imported assets that change and have to be kept track of too. It’s something I haven’t had a chance to address in Unity yet, for example, so I suspect it’s a bit more challenging, but I may be mistaken.


Yes. I thought Ben was prompting us to prefer that over going through Udemy’s platform.

I put the problem with the missing links on his task list. Thanks for bringing this up. :slight_smile:

At the moment, the site we have here is in a constant state of flux, so you can buy the course here.
Ben does intend to have a system eventually, as posted in our Facebook group, whereby if you have the course on Udemy then you can have it on our platform as well. It’s just not quite there yet.
At the moment, consider it business as usual with Udemy, and I believe there may be a $12.99 sale on, but it might be regional.

Hope this clears up a few things on this front :slight_smile:

Indeed, the Blender course I’m probably interested in next is not available on your site. I’ll have to get it via Udemy.


Thank you sir, that was a deep dive! I think I grasped ~90%? :smiley:
What remains of my question, though, is this: if I have my private server where I can run Git (it seems my Ubuntu installation came with Git), should I even bother distinguishing between LFS and non-LFS when pushing my local repo to that server? The 1 TB+ of space I have there is all mine (theoretically I could increase it with larger or more external drives), so is there really a point to LFS? Even if I bother with LFS, I would look into hosting my LFS files on that server anyway (why limit myself to GitHub), so does it matter?

Yes, I would still look into distinguishing between Git and LFS usage; just pulling and applying changes with big bulky files in a Git repository will still hit your productivity and local (non-server) storage.

Whilst it is good to be able to go back in time and pull out older copies of a file, the reality is that you’re not going to need many of those to refer back to. LFS can help by only providing the versions of those files that you need, when you need them, and leaving the rest on your LFS store.
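One practical note if you do want to host the LFS side yourself: a plain bare repo over SSH doesn’t speak the LFS HTTP API on its own, so you’d need something on the server that does (GitLab CE, Gitea, or a standalone LFS server, for instance) and then point your repo at it. A minimal sketch, with the URL as a pure placeholder:

```shell
# Keep pushing the Git repo itself over SSH as before, but send
# LFS objects to a separate LFS endpoint running on the same box.
# (Hostname, port and path here are hypothetical.)
git config lfs.url "http://homeserver:8080/myproject"
```

You can also put the same setting in a `.lfsconfig` file in the repo root so collaborators pick it up automatically.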

Cheers, this is very helpful. So, just to clarify even further (god, this is more complex than I thought): is it the case that LFS does not keep copies of all versions constantly? Are some versions of a large file then lost forever from its development history? I’m a bit confused on that point.

LFS on the server side retains ALL the history of changes to the file (or rather, each copy of it as it changes), but they won’t all be in your Git repository directly, wherever the Git repo lives (only the pointer or shortcut is held there).

So as you switch branches, make changes, and check out commits/branches, your local machine will hold whatever is the current working copy of the LFS files you’ve stored, and your Git repository will remain compact.
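A couple of commands make this visible in practice (assuming git-lfs is installed in the repo you run them in):

```shell
git lfs ls-files     # list which files in the current checkout are LFS-managed
git lfs pull         # download only the LFS content the current checkout needs
git lfs fetch --all  # optionally mirror every LFS object locally, e.g. for backups
```

That last one is worth knowing for the backup concern mentioned earlier: it lets you pull the full LFS history down alongside a full clone of the Git repo.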

I like Atlassian’s Git help pages, as they often supplement the descriptions pictorially, and even the text is sometimes well written and easily consumed. If you haven’t already checked them out (or you did, but now that we’ve had more discussion it may be worth revisiting them), the links in Nina’s first reply in this thread have them.

Indeed, I found the Atlassian guide very helpful, and your reply just now, I think, 100% solidifies my understanding (finally!)
Many thanks!!


Thanks for this, we’re just loading all our courses onto our own site and will update these links soon.
