Removing .html extensions from links on export using an export script

Hello Bootstrap folks!

I have a published site on my own hosting. When I export my site and upload it to my server, I then have to edit every individual file and remove the .html extensions from the links to other pages, because my .htaccess file hides these extensions from my site's URLs.

For example, mydomain.com/index.html becomes mydomain.com/index. If a link to another page has the .html extension, then that extension is displayed in the URL once the new page is visited.

I am still getting to grips with the export script feature and was wondering if anyone could suggest how I might write a script that goes into each .html file and removes the .html extensions from my links?

Thank you very much!

You would need to use a script that runs an HTML parser. I would recommend Python's Beautiful Soup, but be prepared to learn. Once you have that working, you can modify any HTML file and its contents: removing, adding, or changing anything in the file, and saving it with a new name and/or extension.
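For anyone who wants a starting point, here is a minimal sketch of that idea in Python. To keep it dependency-free it uses the standard library's re module rather than Beautiful Soup (a real parser is more robust against unusual markup), and it assumes the export folder path arrives as the first command-line argument. The names strip_extension, rewrite_links and process_folder are made up for illustration.

```python
import re
import sys
from pathlib import Path

# Matches href="..." or href='...' and captures the prefix, the quote and the URL.
HREF_RE = re.compile(r"""(href\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE)


def strip_extension(url: str) -> str:
    """Drop a trailing .html from a site-internal URL, keeping ?query and #fragment."""
    # Leave external or special links (https:, mailto:, //cdn...) untouched.
    if re.match(r"^[a-zA-Z][a-zA-Z0-9+.-]*:", url) or url.startswith("//"):
        return url
    # Separate the path from any query string or fragment.
    match = re.match(r"^([^?#]*)(.*)$", url, re.DOTALL)
    path, rest = match.group(1), match.group(2)
    if len(path) > 5 and path.lower().endswith(".html"):
        path = path[:-5]
    return path + rest


def rewrite_links(html: str) -> str:
    """Return the HTML text with .html removed from internal href values."""
    return HREF_RE.sub(
        lambda m: m.group(1) + m.group(2) + strip_extension(m.group(3)) + m.group(2),
        html,
    )


def process_folder(folder: Path) -> None:
    """Rewrite every .html file under the folder in place."""
    for page in folder.rglob("*.html"):
        original = page.read_text(encoding="utf-8")
        updated = rewrite_links(original)
        if updated != original:
            page.write_text(updated, encoding="utf-8")


if __name__ == "__main__":
    # Assumption: the export destination is passed as the first argument.
    if len(sys.argv) > 1 and Path(sys.argv[1]).is_dir():
        process_folder(Path(sys.argv[1]))
```

Only the href values change; the files themselves keep their .html names, so the .htaccess rules still find them.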

Welcome to the forums!

You can easily do this in Notepad++.

Open Notepad++

Press Ctrl+Shift+F

Find what: .html

Replace with: (leave empty)

Directory: your website folder

Then click Replace in Files

Thank you Twinstream

I will have a look at Beautiful Soup and see if I can work it out. Unfortunately I am not that confident at programming, but Python looks straightforward enough.

kuligaposten, thanks, but I am specifically looking for a way to automate this process, as I have over 20 pages on my site and this now takes too long to do manually.

In Notepad++ you can record the commands above as a macro and save the macro with your favourite shortcut. Set the export script in BSS to YourPathTo\notepad++.exe. When Notepad++ opens after your export, you just press your saved macro shortcut and it's done in all your 20 pages. That's not much manual work, I think.

I just did a Google search to find out if there were easier or better ways to show website pages without extensions. This link looks pretty simple to me and would probably save you a lot of hassle. Hope it helps, and if not, just ignore it :) https://www.youtube.com/watch?v=-6LyG9I-FPc

@Jo

That's probably exactly what he has in his .htaccess, but it doesn't strip the .html from a link. What it does is this: if you have a link mydomain.com/myPage, the server first checks whether myPage is a directory. If there is a directory myPage, the server points you to that directory. If there is no directory myPage, the server checks whether there is a myPage.html and shows you that file under the URL mydomain.com/myPage. If the link is mydomain.com/myPage.html, the server just shows you mydomain.com/myPage.html, because then there is nothing for the server to rewrite.
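To make that lookup order concrete, here is a small Python model of it. This is only a sketch of the behaviour described in this post, not the actual .htaccess rules (which were never posted), and the function name resolve and its parameters are hypothetical.

```python
from typing import Optional, Set


def resolve(request_path: str, directories: Set[str], files: Set[str]) -> Optional[str]:
    """Return what the server would serve for a request, or None if nothing matches."""
    name = request_path.strip("/")
    # 1. A directory with that name wins.
    if name in directories:
        return name + "/"
    # 2. Otherwise look for name.html and serve it under the extensionless URL.
    if name + ".html" in files:
        return name + ".html"
    # 3. A request that already names an existing file needs no rewriting.
    if name in files:
        return name
    return None


site_files = {"index.html", "myPage.html"}
print(resolve("/myPage", set(), site_files))       # served from myPage.html
print(resolve("/myPage.html", set(), site_files))  # served as requested
```

Both requests end up at the same file, but the address bar shows whatever the link asked for, which is why the links themselves have to lose the .html.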

Hmm, I watched that video and it sure looked to me like it was helping you set up pages to show without the .html on the end. Guess I misunderstood it? Oh well, it won't be the first time, I'm sure :P

Did you ever figure out an export script for this? I have no idea what I’m doing with them, so if there’s anything you can share, that would be super useful.

kuligaposten’s suggestion worked best for me: Removing .html extensions from links on export using an export script - #5 by kuligaposten
