How to organize PDF and HTML files into folders so that the interesting one(s) to read can be found when needed.
The Challenge :
Nowadays we just Google whenever we need information about a subject.
Introductory information can always be found on Wikipedia.
But what should be done with scattered off-line information kept in local PDF and HTML files?
Why bother :
Older references, once found and read, are often removed from the web after some time. I want to be able to re-read older references to help my memory, so I need an off-line system.
Once the number of off-line files passes ~100, they must be organized to remain handy. I am way past that limit.
OneNote 2016 :
Other concepts like OneNote are also very useful.
I use OneNote 2016 as the first stop for managing shorter texts, typically captured from the web, but I find OneNote less suited for longer texts, which I prefer to keep as PDF files.
Using the old desktop version has several benefits :
- It allows you to do proper backups. Read the horror story about having no backup when the Microsoft OneNote server messes up : https://community.spiceworks.com/topic/2246511-onenote-has-a-dark-side-stop-using-onenote-until-you-read-this
- It allows you to export the full Notebook, just a Section or just a Page to a PDF file, which may include Bookmarks generated automatically according to some unknown rule.
As a OneNote page can be arbitrarily long, converting it to a PDF file introduces arbitrary page breaks which do not always look good. Fortunately the PDF file can be loaded into Word 2016, which can be used to correct the layout, add any missing Bookmarks and save the new content as a PDF file.
- The continuously updating OneNote iPad app handles the older OneNote 2016 files.
This is an age-old subject. So what new twists can possibly be added to it?
To get started, I will just refer to what others have written, so here is a nice text with a nice text layout :
The Folder and Files Organization Objective :
What they advise makes sense. What they end up suggesting is a ( very ) deeply nested folder structure in which each folder level gets more specific/detailed.
My PDF documents cannot be sharply divided into just one specific topic each, so a deeply nested folder structure actually adds confusion instead of clarity.
So in my case I will add additional constraints :
- I don’t want endlessly deep folder nesting. I limit nesting to max two folder overview levels and one level with files.
- I don’t want folders including only one or two files.
- I don’t want folders including hundreds of files.
- Everything should look the same whether viewed on a PC or viewed on an iPad.
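The constraints above can be checked mechanically. The following is a minimal sketch, not a tool I actually use: the function name check_tree and its thresholds are assumptions for illustration. It assumes that folders holding files may sit at most two levels below the root, that "one or two files" means fewer than 3, and that "hundreds" starts above 100. Empty folders are ignored.

```python
import os

def check_tree(root, max_depth=2, min_files=3, max_files=100):
    """Return (relative_path, problem) pairs for folders breaking the constraints.

    All thresholds are illustrative assumptions, not fixed rules.
    """
    problems = []
    root = os.path.abspath(root)
    for dirpath, _dirnames, filenames in os.walk(root):
        # Ignore hidden housekeeping files such as .DS_Store.
        files = [f for f in filenames if not f.startswith(".")]
        if not files:
            continue  # pure overview folders and empty folders are fine
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if depth > max_depth:
            problems.append((rel, "files nested too deeply"))
        if len(files) < min_files:
            problems.append((rel, "only one or two files"))
        elif len(files) > max_files:
            problems.append((rel, "too many files"))
    return problems
```

Calling check_tree on the top folder of the collection returns an empty list when nothing needs re-arranging.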
What I have come up with is :
- A file naming template.
- A concept of how to use folder names as index cards into the content of a folder.
- Use a special character ¤ to separate items when listed on the same line ( file name or folder name ).
The funny character ¤ is the classic “generic currency” sign ( U+00A4 ). It is included in ISO/IEC 8859-1 and also in Unicode. Texts including this character look the same on both a Windows PC and an iPad. For some strange reason it is never used for anything, but as it looks good visually it is the perfect item separator when making a file name which should include various information.
The File Name Template :
I have therefore created my own file naming template using the ¤ character and some spacing surrounding it as :
Title & Version ¤ Year ¤ Pages ¤ Writer ¤ Publisher
This format is sufficiently general to handle PDF files as well as HTML files.
PDF and HTML are the two basic file formats I use for documentation. Other documents like Word files or Excel files are converted to PDF, as the information is not supposed to be live documents to be modified.
Sometimes HTML content can be converted to PDF too, but HTML pages usually don’t have a layout intended to be separated into pages, so converting those files to PDF makes for a poor reading experience. So I prefer to keep HTML content as an HTML file. There is one noteworthy exception here. It is possible to capture a web page as an image and convert that image to a single page PDF file. See the Firefox plug-in : FireShot.
Title is obvious.
Version could be any brief marking, like v2 or 2nd. Don’t waste many characters on that.
Year is obvious. Use 1978 and not just 78. I considered putting the year first as it indicates the relevance of a file. Something written in 1930 is probably not as up-to-date and relevant as something written in 2010. But in the end, because I have a lot of texts without a date, I settled for making it entry two.
Pages is obvious. It conveniently indicates the complexity of the file. Something 3 pages long is probably not as comprehensive as something 300 pages long. HTML files are not separated into pages, so there is no length info for such a file.
Writer is obvious. I don’t always include the Writer, as I often don’t know the writer anyway.
Publisher is obvious. A book-like text may include a Publisher. But it is more relevant for web content, where naming the company publishing the information could be important.
With all that information to be included in a file name it is important to be as brief as possible, leaving out the Writer and/or Publisher when necessary. Keeping the length shorter than ~ 100 characters is fine.
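The template can be sketched in code. This is only an illustration under stated assumptions: build_name and parse_name are hypothetical helper names, Year is taken to be four digits, and Pages a number followed by p ( as in 165p ). Since Writer and Publisher are both optional free text, the parser cannot tell them apart and simply collects them under "other".

```python
import os
import re

SEP = " ¤ "  # item separator including the surrounding spaces

def build_name(title, ext, year=None, pages=None, writer="", publisher=""):
    """Join the fields that are present, in template order, skipping empty ones."""
    parts = [title]
    if year is not None:
        parts.append(f"{year:04d}")
    if pages is not None:
        parts.append(f"{pages}p")
    parts += [p for p in (writer, publisher) if p]
    return SEP.join(parts) + "." + ext

def parse_name(filename):
    """Split a file name into fields; Year and Pages are recognized by shape."""
    stem, ext = os.path.splitext(filename)
    items = [s.strip() for s in stem.split("¤")]
    info = {"title": items[0], "year": None, "pages": None,
            "other": [], "ext": ext.lstrip(".")}
    for item in items[1:]:
        if re.fullmatch(r"\d{4}", item):
            info["year"] = int(item)
        elif re.fullmatch(r"\d+p", item):
            info["pages"] = int(item[:-1])
        else:
            info["other"].append(item)  # Writer and/or Publisher
    return info

print(build_name("LabVIEW Fundamentals", "pdf", year=2005, pages=165, publisher="NI"))
```

Entries that are missing are simply left out, so a name with only Title and Year is still valid.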
Using Folders as Index Cards to Information in a Folder :
It is important to notice that an iPad lists folders mixed with files when sorting alphabetically. This is an Apple thing and in my opinion quite stupid. I want folders listed in front of files, so something must be done to ensure that the iPad also lists folders ahead of files. A little file naming ingenuity makes it possible.
Folders at the Highest nesting level :
At the highest folder nesting level there are only folders. Below is a listing of the first entries. The full list includes about 100 folders, which is the maximum number of folders I want to scroll through. The content is sometimes re-arranged to keep that limit :
A ¤=¤
Audio - .Acoustics ¤
Audio - .Brüel & Kjær ¤
Audio - .Engineering ¤
Audio - Amplifiers ¤
Audio - Driver Units, Crossover and more
Audio - HI-FI ¤
Audio - Media and Music and more
Audio - Reviews
Audio - Software
Audio - Speakers ¤
B ¤=¤
Basic - LabVIEW and the Nat. Inst. world -¤- 27
Basic - MATLAB and Octave ¤
Basic - Python -¤- 10
C ¤=¤
CAD - Electrical ¤
CAD - Math ¤
CAD - Mechanical ¤
D ¤=¤
DSP - .Analog Devices ¤
DSP - Basic Concepts ¤
DSP - Converter Principles ¤
DSP - Digital Communication ¤
DSP - Digital Filters ¤
I try to group content into a few all-encompassing subjects, like Audio, Basic ( important knowledge ), CAD, DSP and so on. I have about 15 such subjects. But it changes as I sometimes re-group the content within the subjects.
My latest change was the introduction of the Mechatronics subject, which is an important engineering container concept including items from control theory, DSP, mechanical analogies and other sub-subjects. This may introduce the need for cross subject links.
The listing includes these special attributes :
- A ¤=¤ is a visual separator ( an empty folder ) between subjects. I try to use a single character for clarity.
- In Audio - .Acoustics ¤ the subject is Audio. The dot in .Acoustics is used for sorting, forcing Acoustics to be listed first, and it also indicates that this is an important item within the subject. The final ¤ character indicates that there is an organized sub folder here including files for the subject.
- In Audio - Reviews the absence of the ¤ character indicates that there is an un-organized sub folder here including a mess of files.
- In Basic - LabVIEW and the National Instruments world -¤- 27 the presence of the -¤- 27 characters indicates that there is an organized nested sub folder including 27 folders. This concept is a way to avoid having too many folders at the highest nesting level.
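The markers described above are regular enough to be read by a small script. The sketch below is an assumption-laden illustration: classify_folder and the four kind labels are names invented here, and the patterns only cover the marker forms shown in the listing.

```python
import re
from typing import NamedTuple, Optional

class FolderInfo(NamedTuple):
    kind: str                        # "separator", "nested", "organized" or "unorganized"
    label: str                       # folder name without the trailing marker
    subfolders: Optional[int] = None # only set for "nested" folders

def classify_folder(name: str) -> FolderInfo:
    """Classify a top-level folder name by its trailing ¤ marker."""
    name = name.strip()
    m = re.fullmatch(r"(.+?)\s*¤=¤", name)
    if m:  # e.g. "A ¤=¤": empty folder used as a visual separator
        return FolderInfo("separator", m.group(1))
    m = re.fullmatch(r"(.+?)\s*-¤-\s*(\d+)", name)
    if m:  # e.g. "... -¤- 27": organized folder holding 27 sub folders
        return FolderInfo("nested", m.group(1), int(m.group(2)))
    if name.endswith("¤"):  # organized folder holding files
        return FolderInfo("organized", name[:-1].rstrip())
    return FolderInfo("unorganized", name)  # no marker: still a mess

for folder in ["A ¤=¤", "Audio - .Acoustics ¤", "Audio - Reviews", "Basic - Python -¤- 10"]:
    print(classify_folder(folder))
```

A script like this could, for example, list all folders still waiting to be organized.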
Sub folders at the next highest nesting level :
The next highest nesting level may contain :
- Either files for the subject.
- Or it may contain more folders related to the subject to avoid having too many folders at the highest nesting level.
The -¤- 27 in Basic - LabVIEW and the National Instruments world -¤- 27 indicates 27 sub folders. This is the folder listing :
A - Language, Classic and NXG ¤
B - Actor Framework and OOP ¤
C - Project, SVN, EXE-built and more ¤
D - Digital Signal Processing ¤
E - MathScript, Matlab and HiQ ¤
F - Vision ¤
G - SQL ¤
H - Remote Panels and Computing ¤
I - Sim and Control, Simulink, PID and Fuzzy Logic ¤
J - Kalman Filtering ¤
K - Test Automation ¤
L - Python and LabPython ¤
M - Assorted ¤
N - Toolkits ¤
O - References ¤
P - NI-DAQ, VISA, PXI and more ¤
Q - DLL, CIN and more ¤
R - DMA, Buffers and more ¤
S - myRIO and CompactRIO ¤
T - ELVIS ( look in the Mechatronics chapter )
U - Hardware and more ¤
V - CVI aka LabWindows Language ¤
W - ComponentWorks ¤
X - Measurement Studio ¤
Y - ActiveX, ATL, COM and OLE ¤
Z - Multisim ¤
Ø - Newsletter and more ¤
I listed the full content to show that it may be convenient to add a character in front, like A -, in order to control what is listed first ( most often used or most relevant or whatever ).
But the folder could also have looked like the folder listing shown for the highest nesting level. The only rule here is that it should look good to the reader ( me ).
Sub folders at the lowest nesting level :
The lowest nesting level ( either second or third ) holds the actual files related to the subject. Here is a listing :
-1 = Introductions and Tutorials
-3 = Comprehensive Texts
-4 = Fluffy Texts
-5 = Comprehensive Documentation
-6 = LabVIEW Technical Resource
#1
#1 = Documentation Resources Index ¤ 2018.html
#1 = Get Start with LabVIEW ¤ 2013 ¤ 89p ¤ NI.pdf
#1 = Introduction to LabVIEW ¤ 2016 ¤ 71p.pptx
#1 = LabVIEW Fundamentals ¤ 2005 ¤ 165p ¤ NI.pdf
#1 = Tips Labview Development ¤ 2007 ¤ 39p.pdf
#3
#3 = LabVIEW - User Manual ¤ 2003 ¤ 349p ¤ NI.pdf
#4
#4 = Best Pract. for BDs and FPs ¤ 2011 ¤ 115p.pdf
#4 = GPOWER XNodes and VIMs ¤ 2016 ¤ 33p.pdf
#4 = SW Eng Tools with LabVIEW - Hands On ¤ 43p.pdf
#4 = LabVIEW - Dev Guidelines ¤ 2003 ¤ 97p ¤ NI.pdf
#4 = LabVIEW - Meas. Manual ¤ 2000 ¤ 358p ¤ NI.pdf
#4 = LabVIEW - Meas. Manual ¤ 2003 ¤ 159p ¤ NI.pdf
#4 = LabVIEW Graph Dev - Hands On ¤ 2006 ¤ 126p.pdf
#4 = What is LV used for ¤ ViewPoint Systems.html
#5
#5 = G Prog Reference Manual ¤ 1998 ¤ 667p ¤ NI.pdf
#5 = Func and VI Ref Manual ¤ 1999 ¤ 609p ¤ NI.pdf
#5 = LabVIEW Version 5.1 Addendum ¤ 1999 ¤ 108p.pdf
#5 = The LabVIEW Style Book ¤ 363p.pdf
#6
#6 = LabVIEW Technical Resource 1996 Q3 ¤ 24p.pdf
#6 = LabVIEW Technical Resource 1999 Q3 ¤ 8p.pdf
#6- = Tech. Res. Introduces Bundled Value Packs.pdf
The listing includes both folders ( the names starting with a minus sign ) and files ( the names starting with # ). ( I have shortened some file names to avoid line wrap-around in this post ).
The listing also includes these special attributes :
- #1 is a visual separator ( an empty file ) between each group. I try to use a single character for clarity.
- The minus sign in -1, preceding the number or character, is important as it controls what an iPad lists first. So the folder names start with this character to ensure they are listed first.
The folders shown first are empty and are only used as a convenient Index Card content overview of the files.
Both the iPad and a computer indicate folders with one type of icon and files with other types of icons, which adds to the ease of content overview.
The first six numbers ( 0 to 5 ) shown in the folder names are reserved and always read as follows ( when included ) :
-0 = Recommended Texts
-1 = Introductions and Tutorials
-2 = Brief Concise Texts
-3 = Comprehensive Texts
-4 = Fluffy Texts
-5 = Comprehensive Documentation
The recommended text is listed first. The other texts are listed in “heavy” order.
The remaining numbers ( 6 to 9 ) and characters ( A to Y ) can be included as needed. Z has a special meaning. It is listed last and indicates that the subject includes one or more zip files that may be convenient to have here :
-Z = ZIPs and more
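The index card folders can be derived from the group codes used in the file names. This is a rough sketch only: index_card_names and the extra_labels parameter are hypothetical names, the reserved labels are copied from the list above, and any free code ( 6 to 9, A to Y ) without a supplied label gets a placeholder.

```python
import re

# Reserved group codes and their fixed labels, as listed in the text.
RESERVED = {
    "0": "Recommended Texts",
    "1": "Introductions and Tutorials",
    "2": "Brief Concise Texts",
    "3": "Comprehensive Texts",
    "4": "Fluffy Texts",
    "5": "Comprehensive Documentation",
    "Z": "ZIPs and more",
}

def index_card_names(file_names, extra_labels=None):
    """Return the '-N = Label' folder names needed for the given '#N ...' files."""
    labels = {**RESERVED, **(extra_labels or {})}
    codes = set()
    for name in file_names:
        m = re.match(r"#([0-9A-Z])", name)  # group code right after the #
        if m:
            codes.add(m.group(1))
    # Digits sort before letters, so Z ends up last.
    return [f"-{c} = {labels.get(c, '( unnamed )')}" for c in sorted(codes)]

files = ["#1", "#1 = LabVIEW Fundamentals ¤ 2005 ¤ 165p ¤ NI.pdf",
         "#3", "#3 = LabVIEW - User Manual ¤ 2003 ¤ 349p ¤ NI.pdf"]
print(index_card_names(files))
```

The returned names could then be compared with the empty folders actually present, to spot an index card that is missing or no longer needed.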
This concludes the description of my preferred off-line file organization. The basic idea is to present the files attractively in my preferred style. It requires some discipline to maintain but as long as the organization can be done on a PC ( using Total Commander ) it is manageable.
Take notice of the use of folders as a Content or Index Card listing, giving a quick impression of the file content within a subject folder. That concept can be tweaked as desired.