Computer Nerd Kev
2023-07-31 04:49:14 UTC
I don't like browsing huge single HTML pages of documentation. Does
anyone know of a program or script (preferably for Linux) that can
scan a big software manual's single HTML page and automatically
break it up according to the contents section and the corresponding
anchor links?
Basically I want something to turn this:
http://www.gnu.org/software/coreutils/manual/coreutils.html
into this:
http://www.gnu.org/software/coreutils/manual/html_node/index.html
But without needing the Texinfo source that GNU software (usually)
provides, just working from the HTML itself. I also want it to
output static HTML, so no solutions using JavaScript or browser
add-ons.
One option might be to use csplit to break it up at common section
separator patterns, then have a simple script rename the new files
according to their heading text. But I'd like to have HTML
navigation links, ideally including converting existing anchor
links inside the document.
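
To make that idea concrete, here is a rough Python sketch of the kind
of thing I mean. It is only an illustration under some big
assumptions: that every section starts at an <h2> tag, that anchors
are plain id="..." attributes, and that regexes are good enough for
the page in question (real manuals may well need a proper HTML
parser). The names split_manual and slugify are just made up for the
example.

```python
import re

# Assumption: sections begin at <h2>; adjust the tag for other manuals.
HEADING = re.compile(r'<h2\b[^>]*>(.*?)</h2>', re.I | re.S)
ID_ATTR = re.compile(r'\bid="([^"]+)"')
LOCAL_HREF = re.compile(r'href="#([^"]+)"')


def slugify(heading_html):
    """Turn heading markup into a lowercase file-name stem."""
    text = re.sub(r'<[^>]+>', '', heading_html)  # drop nested tags
    return re.sub(r'[^a-z0-9]+', '-', text.lower()).strip('-') or 'section'


def split_manual(body_html):
    """Split body_html at each <h2> and return {filename: page_html}."""
    starts = [m.start() for m in HEADING.finditer(body_html)]
    bounds = [0] + starts + [len(body_html)]
    chunks = [body_html[a:b] for a, b in zip(bounds, bounds[1:])]

    # Text before the first heading becomes index.html; every other
    # chunk is named after its heading, with a numeric prefix so that
    # repeated heading text cannot collide.
    names = []
    for i, chunk in enumerate(chunks):
        m = HEADING.match(chunk)
        names.append(f'{i:02d}-{slugify(m.group(1))}.html' if m else 'index.html')

    # Record which page each id="..." anchor ended up on.
    owner = {}
    for name, chunk in zip(names, chunks):
        for anchor in ID_ATTR.findall(chunk):
            owner.setdefault(anchor, name)

    def relink(match):
        # Rewrite href="#x" to href="page.html#x" when x is a known anchor.
        anchor = match.group(1)
        target = owner.get(anchor)
        return f'href="{target}#{anchor}"' if target else match.group(0)

    pages = {}
    for i, (name, chunk) in enumerate(zip(names, chunks)):
        nav = []
        if i > 0:
            nav.append(f'<a href="{names[i - 1]}">Previous</a>')
        if i + 1 < len(names):
            nav.append(f'<a href="{names[i + 1]}">Next</a>')
        pages[name] = ('<p>' + ' | '.join(nav) + '</p>\n'
                       + LOCAL_HREF.sub(relink, chunk))
    return pages
```

The returned pages are bare fragments, so each one would still need
wrapping in <html>/<head>/<body> (and probably the original
stylesheet link) before being written out to a file of that name.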
A prime target would be the Raspberry Pi configuration
documentation, which has convinced me of the merit of multi-page
docs by how confusing it has become for me since they switched to a
single-page layout:
https://www.raspberrypi.com/documentation/computers/configuration.html
--
__ __
#_ < |\| |< _#