• The eternal confessions of a beautiful mind...
  • DamianM.Co.UK

    Importing bookmark.htm with Regular Expressions

    The one good thing about Netscape is the bookmark.htm file. If you have ever
    tried to copy a folder full of individual URL files to a disk, you know how
    much longer that takes, and as for uploading a URL file, forget about it.

    Here we will be looking at how to extract the bookmarks from the file.


    Getting the href from an anchor tag is quite easy. Here we are also going to
    extract the path to the bookmark, that is, the folders it sits in, and this
    makes it a little more difficult.
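
    To show the easy half first, here is a minimal sketch that pulls only the
    href out of an anchor tag. The sample line and the module name are made up
    for illustration; only the HREF="..." shape is taken from the bookmark file.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module HrefOnly
    Sub Main()
        ' Hypothetical sample line in the Netscape bookmark style.
        Dim html As String = "<DT><A HREF=""http://www.example.com/"" ADD_DATE=""1"">Example</A>"

        ' Capture everything between the quotes that follow HREF=.
        Dim r As New Regex("HREF=""(?<href>[^""]+)""")

        For Each m As Match In r.Matches(html)
            ' Prints: http://www.example.com/
            Console.WriteLine(m.Groups("href").Value)
        Next
    End Sub
End Module
```

    This tells us where each bookmark points, but not which folder it belongs
    to, which is why the full expression needs more than one pattern.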


    The regular expression we use to do this is massive.

            ' Three patterns joined with | : a bookmark anchor, a folder heading, and the end of a folder.
            Dim r As New System.Text.RegularExpressions.Regex( _
                "(HREF=""(?<href>[^""]+)""[\w\W]*?ADD_DATE=""(?<add_date>[^""]+)""[\w\W]*?LAST_VISIT=""(?<last_visit>[^""]+)""[\w\W]*?LAST_MODIFIED=""(?<last_modified>[^""]+)""[\w\W]*?>(?<title>[^<]+)<)" & _
                "|(<H3[\w\W]*?>(?<folder>[^<]+)<)" & _
                "|(</DL>(?<back>[^p]+)p>)")

    However, do not worry; it is actually three straightforward expressions joined
    together. This is an example of using | (or) to select whichever of the three
    patterns we are interested in matches at a given point in the file:



    • an anchor tag, containing information about the bookmark

    • a folder title, indicating entry into a new folder

    • the literal </DL><p>, indicating the end of a folder
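
    The three cases above can be turned into a folder path by walking the
    matches in order. What follows is only a sketch under stated assumptions:
    the sample text is hand-written rather than taken from a real bookmark.htm,
    and the module name, the path list and the output format are illustrative
    choices. A folder match pushes a name onto the path, a back match pops one
    off, and an href match is reported together with the current path.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module BookmarkWalk
    Sub Main()
        ' Hypothetical, hand-written sample in the Netscape bookmark layout.
        Dim html As String = String.Join(Environment.NewLine, New String() { _
            "<DL><p>", _
            "<DT><H3 ADD_DATE=""1"">News</H3>", _
            "<DL><p>", _
            "<DT><A HREF=""http://example.com/"" ADD_DATE=""1"" LAST_VISIT=""2"" LAST_MODIFIED=""3"">Example</A>", _
            "</DL><p>", _
            "<DT><A HREF=""http://example.org/"" ADD_DATE=""4"" LAST_VISIT=""5"" LAST_MODIFIED=""6"">Top level</A>", _
            "</DL><p>"})

        ' The same three-part expression used in the article.
        Dim r As New Regex( _
            "(HREF=""(?<href>[^""]+)""[\w\W]*?ADD_DATE=""(?<add_date>[^""]+)""[\w\W]*?LAST_VISIT=""(?<last_visit>[^""]+)""[\w\W]*?LAST_MODIFIED=""(?<last_modified>[^""]+)""[\w\W]*?>(?<title>[^<]+)<)" & _
            "|(<H3[\w\W]*?>(?<folder>[^<]+)<)" & _
            "|(</DL>(?<back>[^p]+)p>)")

        ' Folder names from the root down to the current position.
        Dim path As New List(Of String)()

        For Each m As Match In r.Matches(html)
            If m.Groups("folder").Success Then
                ' Entering a new folder.
                path.Add(m.Groups("folder").Value)
            ElseIf m.Groups("back").Success Then
                ' Leaving a folder; the outermost </DL><p> has nothing to pop.
                If path.Count > 0 Then path.RemoveAt(path.Count - 1)
            ElseIf m.Groups("href").Success Then
                Console.WriteLine("/" & String.Join("/", path.ToArray()) & " : " & _
                                  m.Groups("title").Value & " -> " & m.Groups("href").Value)
            End If
        Next
        ' Prints:
        ' /News : Example -> http://example.com/
        ' / : Top level -> http://example.org/
    End Sub
End Module
```

    One thing to watch for: because [\w\W]*? can run across line breaks, an
    anchor that is missing one of the attributes will keep searching into the
    entries that follow it, so real files may need a more forgiving pattern.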


    Once you have extracted the bookmarks, you can display them in your own format,
    or go that little bit further and check that none of them are dead links.
