I recently read a tutorial titled: “how to annoy your collaborators: a git CI pipeline for LaTeX” ;)
Don’t put binary files in git
What’s the issue with binaries in git? Just that diff’ing binary files is useless?
They are generally large, incompressible, and replaced rather than updated the way text files are. Every version stays in the repo history forever, so they make repos big and slow compared to text files, with no advantages in return (e.g. as you said, diffing etc. is useless).
If a binary file needs to be stored in git, it’s usually more appropriate to use git LFS for that file. Git LFS stores the binary outside of the repo in the same way that database engines store binary outside of the respective table.
In this case, it would be much smarter to use version control on the text in the document, not the binary file, which is a feature of essentially every document writer program.
Cool! Good to know.
It’s not ideal, but for a thesis — which ideally has an end date after which it won’t be used — it’s not a huge problem I’d argue.
Git is like shit for Word documents
Unzip the docx with a pre-commit hook
(This is not a serious suggestion)
Still better than using file names.
Why on Earth would you curse yourself with MS Office anyway, especially if writing docs is your professional responsibility?
Why not use Git+Markdown+Pandoc, have your copy, data and layout separate?
I understand that a lot of institutions/companies impose stylistic/technical requirements for docs and publications, but that still doesn’t mean you gotta stay married to the worst tooling.
This is the way.
This is the way.
Why on Earth would you curse yourself with MS Office anyway
idk it says .docx in OP’s image
Oh sorry, I was too focused on calling out the silliness of the idea.
But better for LaTeX
and then there are fucking PIs insisting on Word files who never heard of tracked changes, let alone of file naming conventions.
I dunno what a PI is, but my honours thesis supervisor was the person who first introduced me to TeX. And gods, I wish I had known about it earlier in uni, or even back in high school. It is so useful when writing any sort of papers with sections and diagrams and bibliography.
Check out Typst (a newer TeX-like layout engine) if you have time, I’m interested in your opinion. I find it a bit simpler to use than TeX.
Un(?)fortunately I don’t have much cause these days for either TeX or some equivalent to it. Anything I’m writing today is simple enough that it doesn’t need anything more sophisticated than markdown for formatting.
Principal Investigator. It’s the lead scientist in charge of the project.
Then start writing in Markdown. Markdown has an easier syntax, supports LaTeX equations, has metadata, and is plain text, so you can use git. And the killer feature is that you can use pandoc to convert the Markdown file into Word, pptx, LaTeX PDFs, HTML, etc. You can also set up a Makefile that runs pandoc on demand.
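Not a Makefile, but roughly the same idea sketched in Python, assuming pandoc is installed and on your PATH. The helper names `pandoc_command` and `convert` and the file name `thesis.md` are made up for illustration. pandoc picks the output format from the extension of the `-o` file, and PDF output additionally needs a LaTeX engine installed.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source: str, target: str) -> list:
    """Build the pandoc call; pandoc infers the output format from the target extension."""
    return ["pandoc", source, "-o", target]


def convert(source: str, formats=("docx", "pptx", "html", "pdf")) -> None:
    """Convert one Markdown file into each requested format (hypothetical helper)."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    for ext in formats:
        target = str(Path(source).with_suffix("." + ext))
        # check=True makes a failed conversion raise instead of passing silently
        subprocess.run(pandoc_command(source, target), check=True)
```

Calling `convert("thesis.md")` would then write `thesis.docx`, `thesis.pptx`, and so on next to the source file.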
yeah this is what i used for some projects, i.e. rmarkdown which also integrates the statistics part
.gitattributes can invoke Word on Windows to diff versions, and there are plenty of open source scripts that can do it if you don’t have a copy of Word (or Windows) lying around.
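For the no-Word route, here is a rough sketch of what such a script can look like. A .docx is a zip archive whose main text lives in `word/document.xml`, so you can pull out the paragraphs and let git diff the plain text. This assumes a `.gitattributes` line like `*.docx diff=docx` plus `git config diff.docx.textconv "python docx2txt.py"`; the script name `docx2txt.py` and the function name `docx_to_text` are made up, and it does nothing special for headers, footnotes, or tracked changes.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements such as w:p and w:t
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path) -> str:
    """Return the document body as plain text, one line per paragraph."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    lines = []
    for paragraph in root.iter(W + "p"):
        # w:t elements hold the visible text runs of a paragraph
        lines.append("".join(t.text or "" for t in paragraph.iter(W + "t")))
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) > 1:
    # git passes the path of the file to convert as the first argument
    print(docx_to_text(sys.argv[1]))
```

With that in place, `git diff` shows paragraph-level text changes for .docx files instead of "Binary files differ". The stored file is still a binary blob, so the repo size concerns do not go away.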
But Word is like shit for papers. Use LaTeX instead.
Just like word documents are shit for papers and theses/dissertations it turns out. The formatting alone is a nightmare.
That’s why we wrote our thesis in LaTeX: https://github.com/jonte/GGS-report/blob/a9d9d20bcc22a524629e371ce5984f131490b743/report.lyx#L362
I also have my reports in latex inside a git repo, complete with a makefile to generate graphs from csv containing simulation results. However I am too ashamed to publish the entire version control to a public repo
#LyX 2.0 created this file. For more info see http://www.lyx.org/
Wait, I thought you guys did it manually…
Anyway, I should still learn it.
It’s an editor that helps you write it; you can still go inside and change things manually if you need/want to.
Between zfs and git, all my important data is versioned.
BTRFS for all us lame folks.
PS Windows Previous Versions is actually pretty good, but no one uses it on desktop.
I encountered an engineering firm that did this. I wanted to do it too.
The company I worked for at the time (said engineering firm was doing subcontracting for us) was full of older business people who could never in a million years have wrapped their heads around the idea.
I also met this at a contracting job. Drove me bonkers.
I wrote about half of my thesis in R Markdown using Git to back up my work. It’s fantastic because you can have your plots and statistics integrated directly into your paper, and formatting in Markdown is much easier than straight-up LaTeX.
R markdown is awesome. I’d always use it for my biostatistics tests and assignments.
Me with Jupyter Notebooks
“Delete this repository” ate my homework.
The weird part is that most modern office software has version control built right in.
And I still do this with all my files anyway.
Use date/time in your file name, using GMT:
Metrics of Sales 2024-05-22_14-29.docx
Very unlikely to have 2 docs with the same down-to-the-minute time stamp in the name.
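If you’d rather not type the stamp by hand, here is a minimal sketch in Python. `stamped_name` is just an illustrative name, and it formats the time in UTC (which is what GMT amounts to here) to match the example above.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def stamped_name(filename: str, when: Optional[datetime] = None) -> str:
    """Insert a UTC 'YYYY-MM-DD_HH-MM' stamp between the file stem and its extension."""
    # Default to the current time; convert whatever we got to UTC
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    path = Path(filename)
    return f"{path.stem} {when:%Y-%m-%d_%H-%M}{path.suffix}"
```

Because the stamp sorts lexicographically, a plain alphabetical file listing also ends up in chronological order.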
I generally do this on my NAS, combined with nightly and bi-weekly backups, plus a 6-mo safety backup, to a backup drive. Also, basic off-site nightly backups for important stuff. If I worked on really important stuff that required lots of versioning, though, I’d probably go with a versioning system instead of inserting the date.
Who handles the live replication and offsite storage rotations for your quantum encrypted multi site redundant back up system?
I kid (because your excellent practices put mine to absolute shame). Thanks for the reminder to get serious about backups!
If you think this process involves enough mindpower to check the time, let alone figure out where the dashes are in whatever language keyboard setup I’m using at the time, you are wildly overestimating how much care goes into doing this.
Eh. I think he refers to auto-naming on save with the date, not doing it manually.
I have an AutoHotkey script that drops the current date in ISO8601. I don’t need timestamps often, so date is sufficient. I like to have manual control of file names since I very frequently do not want files renamed.
Cute related story: I taught my 6 y.o. son this macro so he can save his Krita art with the date (and then some keyboard spam ending in “poop”, usually). The macro shortcut I set is `T so he now calls the date “ticky tee”. Any set of numbers with dashes is a “ticky tee” to him, and if AutoHotkey is closed he runs to get me because “ticky tee isn’t working, Daddy!”
Dammit, why have I never thought to use AHK for this? I already use the custom context menu script someone developed about 15 years ago (Favorite Folders? It’s on the AHK/AutoIT forum), I can just add it to that.
AHK/AutoIT are game changers. I feel naked on a machine without it, I’m so used to Ctrl-Middle-click to get to all sorts of things… Folders, scripts, tools, automations (like your date idea), etc.
Ticky-tee! Hahahaha, love it!
Well, if you can’t be bothered to ensure file names mean something, then you get to enjoy the results.
In the Real World®, sometimes files get shared and traded around, and conversations happen about them, and you need to be able to quickly verify you’re looking at the same doc.
We can’t all be connected to the same version control system.
Well, if you can’t be bothered to ensure file names mean something, then you get to enjoy the results.
Now you’re getting it.
I’ve had the built-in version control do unexpected things, so I play it safe and create named backup files. I usually end up using that one file, but I’ve been saved on occasion.
It’s just not trustworthy.
Latex and git ❤️
I also added a Makefile for mine (LaTeX), and it would add the commit hash to the front page (with an asterisk if the repository had uncommitted changes).
So, if I gave a draft to someone and got feedback, I’d know exactly which revision it was.
Hey, amazing idea, can you share the code?
Sure thing. This also includes the beamer bit which I used for my defense. It’s all pretty hacky but hope it’s useful!
    #
    # Errors aren't handled gracefully (tex doesn't write to stderr, it seems)
    # If you encounter errors, use "make verbose"
    #
    # For small changes (probably those without references), use "make quick"
    #
    # Thanks to https://gist.github.com/Miliox/4035649 for dependency outline

    TEX = pdflatex
    BTEX = biber
    MAKE = make -s
    TEXFLAGS = -halt-on-error
    # $(MAIN).log is dumb if we have multiple targets!
    SILENT = > /dev/null || cat $(MAIN).log
    SILENT_NOER = 2>/dev/null 1>/dev/null
    EDITOR = vim -p
    PDFVIEW = evince

    MAIN = main
    PRES = presentation
    ALL = $(MAIN).pdf
    RECURS = media/ manuscripts/

    VERSION := $(shell git rev-parse --short HEAD | cut -c 1-4)$(shell git diff-index --quiet HEAD && (echo -n ' ';git log -1 --format=[%cd]) || (echo -n '* '; date -u '+[%c]'))

    all: recurs $(ALL)

    pres: $(PRES).pdf

    scratch: scratch.pdf

    scratch.pdf: scratch.tex
    	@echo "TEX (final) $<"
    	@$(TEX) $(TEXFLAGS) $< $(SILENT)

    verbose: SILENT = ''
    verbose: $(ALL)

    recurs: $(RECURS)
    	@$(foreach DIR, $(RECURS), \
    		echo "MAKE (CD) $(CURDIR)/$(DIR)"; \
    		$(MAKE) -C $(DIR) $(MAKECMDGOALS);)
    	@echo "MAKE (CD) ./"

    clean:
    	@echo "SH (RM) Not recursing; 'make allclean' to clear generated files."
    	@rm -f *.aux *.log *.out *.pdf *.bbl *.blg *.toc *.lof *.lot *.bcf *.run.xml

    allclean: recurs
    	@echo "SH (RM) A clean directory is a happy directory"
    	@rm -f *.aux *.log *.out *.pdf *.bbl *.blg *.toc *.lof *.lot *.bcf *.run.xml

    version:
    	@echo "SH (ver) $(VERSION)"
    	@echo $(VERSION) > VERSION.tex

    nixpages: main.pdf
    	@echo "PDF (pdftk)"
    	@pdftk main.pdf cat 1 4-end output final.pdf

    quick: $(MAIN).tex version
    	@echo "TEX (final) $<"
    	@$(TEX) $(TEXFLAGS) $< $(SILENT)

    $(MAIN).pdf: $(MAIN).tex $(MAIN).bbl all.tex tex/abstract.tex tex/intro.tex tex/appendix.tex tex/some_section.tex tex/some_other_section.tex
    	@echo "TEX (draft) $<"
    	@$(TEX) $(TEXFLAGS) --draftmode $< $(SILENT)
    	@echo "TEX (final) $<"
    	@$(TEX) $(TEXFLAGS) $< $(SILENT)

    $(MAIN).bbl: $(MAIN).aux
    	@echo "BIB (bib) $(MAIN)"
    	@$(BTEX) $(MAIN) > /dev/null

    $(MAIN).aux: $(MAIN).tex $(MAIN).bib version
    	@echo "TEX (draft) $<"
    	@$(TEX) $(TEXFLAGS) --draftmode $< $(SILENT)

    $(PRES).pdf: $(PRES).tex $(PRES).bbl tex/beamer*.tex tex/slides/*.tex
    	@echo "TEX (draft) $<"
    	@$(TEX) $(TEXFLAGS) --draftmode $< $(SILENT)
    	@echo "TEX (final) $<"
    	@$(TEX) $(TEXFLAGS) $< $(SILENT)

    $(PRES).bbl: $(PRES).aux
    	@echo "BIB (bib) $(PRES)"
    	@$(BTEX) $(PRES) > /dev/null

    $(PRES).aux: $(PRES).tex $(MAIN).bib
    	@echo "TEX (draft) $<"
    	@$(TEX) $(TEXFLAGS) --draftmode $< $(SILENT)

    edit:
    	@echo "EDIT (fork) $(EDITOR)"
    	@$(EDITOR) ./tex/*.tex *.tex

    view:
    	@echo "VIEW (fork) $(PDFVIEW)"
    	@$(PDFVIEW) $(ALL) $(SILENT_NOER) &
I also had some Makefiles in other directories, e.g., for my media/ I had:

    MAKE = make -s
    RECURS = svgs/

    recurs: $(RECURS)
    	@$(foreach DIR, $(RECURS), \
    		echo "MAKE (CD) $(CURDIR)/$(DIR)"; \
    		$(MAKE) -C $(DIR) $(MAKECMDGOALS);)
    	@echo "MAKE (CD) $(CURDIR)/"

    all: recurs

    clean:

    allclean: recurs clean

and for media/svgs/:

    SVG_FILES := $(wildcard *.svg)
    PDFDIR := ./
    PDF_FILES := $(patsubst %.svg,$(PDFDIR)/%.pdf,$(SVG_FILES))

    all: $(PDF_FILES)

    clean:
    	@rm -f $(PDF_FILES)
    	@echo "SH (RM) Tidying up derived PDFs"

    allclean: clean

    $(PDFDIR)/%.pdf: %.svg
    	@inkscape -T --export-pdf=$@ $<
    	@echo "INK (PDF) $<"
Thank you!!! I’ll see if I manage to make it work for me.
Makefile in other comments. You’ll need something like this on the title page (this assumes you use my Makefile which puts the version in VERSION.tex [that’s the literal name of the file, not a placeholder]):

    {\bf{\color{red}DOCUMENT REVISION:}} {\color{blue}\input{VERSION}}
This is brilliant
git tag "FINAL FINAL FINAL DRAFT - v20"
Counterpoint: advisor said no.
“Just use Word, everyone else does. I have never heard of this latex thing, so must be just some trendy useless overengineered software that does Word’s job but worse. Word can track changes just fine, and you can leave comments.” *proceeds to strikethrough, highlight, and inline comment everything instead of using either of those features* “I want to read what you wrote, not fight technology” *proceeds to email you three separate times after forgetting to attach v28 about how a graphic looks wrong because Word ate it*
you can still use word with git. it’s versioning first, diffing and merging only where possible. since you probably won’t branch you won’t need the latter, though.
Missing diffs is a problem, though.
I don’t get how Microsoft owns GitHub yet still hasn’t figured out a way to create a git-compatible spec for Excel, Word, and PowerPoint files.
Easy, they want you to buy a onedrive subscription.
Preaching to the choir. “But Box already supports ‘versioning’, why use a confusing hacker tool instead?”
oh I see, you have a shared drive. i assumed you send it around as emails.
A fine assumption given what I wrote. Unfortunately, we did both depending on what he felt like at the time. Yes, for the same doc.
Dude was, shall we say, hands-on about certain things. My dissertation is still embargoed because he is paranoid about being scooped. Joke’s on him, everything that hasn’t been published is not exciting enough to meet his own metric for publishability.
While correct in the sense of word and versioning via mail being a nightmare, I really don’t think you can expect anyone to learn latex just so they can comment in your document. I would have offered to send a pdf. Shoot me.
I would have offered to send a pdf
I would have never considered doing anything but sending a PDF. Even if they do know LaTeX. Unless they’re offering to help edit the code for me, what good is it? It’s objectively harder to read than the formatted PDF.
That said, marking up a PDF is much more difficult and does require more specialised software and know-how than editing plain text or even editing a Word document. So there are some advantages to it.
With the Todo package you can easily make inline comments about what needs to change.
Adding comments to PDFs is actually very easy, is it not? Even that Adobe PDF crapware can do it, you don’t even need a good pdf reader (like Okular from KDE).
This is exactly it. My advisor wanted a word doc to edit, not a PDF. I wasn’t quite snooty enough to think that he should learn latex. Though, if he ever took the time to learn (what time?), I’m sure the writing process would be unbearable for other reasons not entirely related.
I’m going to send you a pdf, you can email me back with the notes or comments in the PDF itself, whatever suits your fancy, and I’ll keep those notes and send you a new PDF with them.
I did this and I had no issues with any of the theses I have submitted in my bachelor’s or master’s.
First year calculus teacher, thank you SO much for forcing us to write submissions in latex.
Also, Overleaf is a thing. This is not like my 1st year of uni; this is 11 years later or so. If your fucking professor has never heard of LaTeX, they are just bad at academia and shouldn’t be teaching, honestly. It’s not just about the field knowledge.
That’s assuming they are competent enough to even use a PDF.
I’m going to send you a pdf, you can email me back with the notes or comments in the PDF itself, whatever suits your fancy, and I’ll keep those notes and send you a new PDF with them.
I do this, but from Word.
I learned LaTeX for my master’s thesis. Never used it again afterwards, except for my résumé.
: don’t even talk to your advisors, just hand in a finished PDF
Fourth panel from Mark Pilgrim:
- Writing a programming book that typesets your sample code into the book and also runs it to update the sample output shown in the book.