Recently I’ve been playing around with a WebBrowser control to automate my interaction with a website for testing purposes. I am using C# and .NET.

At first I found it a bit difficult to change the value of an HTML combobox (i.e. a dropdown), but as it turns out it’s rather easy. I used the following code to change the value of a combobox named “test” to the value 12.

foreach (HtmlElement el in webBrowser1.Document.GetElementsByTagName("select"))
{
    if (el.Name == "test")
    {
        foreach (HtmlElement comboItem in el.Children)
        {
            Console.WriteLine(comboItem.InnerText + " " + comboItem.GetAttribute("Selected"));
            if (comboItem.InnerText == "12")
            {
                comboItem.SetAttribute("Selected", "True");
            }
        }
    }
}

This code was used for the following HTML:

<select name="test">
  <option value="10" SELECTED>10</option>
  <option value="11">11</option>
  <option value="12">12</option>
  <option value="13">13</option>
  <option value="14">14</option>
</select>

Finally the following code was required to press the “Submit” button:

foreach (HtmlElement el in webBrowser1.Document.GetElementsByTagName("input"))
{
    if (el.Name.ToLower() == "submit")
    {
        el.InvokeMember("click");
    }
}

I have a Windows XP installation on a VMware virtual hard disk. Today I tried to boot it, but… OOPS (no, I don’t mean Object Oriented Programming and Systems… I mean… crap!). It seems I forgot the password of the installation. So a little adventure started…

1. After a bit of research I found out about ophcrack. I downloaded the live CD as an ISO image and set VMware to load that CD.

2. When the VMware machine started, and before Windows started booting, I clicked on the VMware screen and pressed ESC. This brought up the menu for selecting the boot device.

3. I chose to boot from the CD.

4. The ophcrack live CD started loading, but when it finished I got a “No partition containing hashes found” error.

5. The problem seemed to be that the Windows installation was on a SCSI virtual disk that was not recognised by this Linux distribution. I ran “fdisk -l” in a terminal from within the ophcrack live CD and it didn’t return any results. To be able to crack the password I needed access to the “Windows/System32/config/” folder of my virtual hard disk. So…

6. I created a second virtual hard disk in the same VMware virtual machine and downloaded an ISO image of Ubuntu.

7. Installed Ubuntu on the newly created hard disk.

8. Booted into Ubuntu, which was able to access the virtual hard disk of the Windows installation. I copied the folder “Windows/System32/config/” to my local Windows 7 installation.

9. Downloaded ophcrack for Windows and installed it on my Windows 7. Also downloaded the XP Free Small Table.

10. Launched ophcrack and clicked on “Tables”->Install and selected the folder where I had downloaded the XP Free Small Table file (if it is a zip file you need to unzip it).

11. Selected Load->Encrypted SAM and chose the “config” folder I had copied from the VMware installation (through Ubuntu).

12. Got my password in 45 seconds!!

I have developed a component in Java that requires an HTML parser. The component goes through around 2000 webpages and gets some data.

It was quite easy to implement using the HTML Parser library, org.htmlparser (http://htmlparser.sourceforge.net/). Some of the webpages are quite big (up to a few hundred MB), but even allowing for that, the memory of the component seemed to grow inexplicably, leading to a Java heap out-of-memory error. I spent a good deal of time trying to figure out the source of the leak, thinking it was my code. After a few attempts to identify the problem, I used the IBM Support Assistant workbench and took a heap dump using the command:

jmap -dump:format=b,file=heap.bin processID

I was able to identify a lot of Finalizer objects referencing org.htmlparser.lexer. This looks like a memory leak, where the garbage collector can’t collect the objects properly?

Well.. the fact of the matter is I haven’t spent an enormous amount of time reading the documentation and/or source code of the project.  It seems there is a close() method that can be called on the Page reference of the lexer and I haven’t used it. So, at the end of my method that does the parsing I added:

parser.getLexer().getPage().close();
parser.setInputHTML("");

The first statement closes the Page object. I added the second statement just to be on the safe side, even though it’s probably redundant.
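Since the component loops over a couple of thousand pages, it is probably worth making sure the cleanup runs even when one page fails to parse. The sketch below only illustrates that try/finally shape. FakePage, parseOne and parseAll are made-up stand-ins of mine, not part of the org.htmlparser API, so treat it as a pattern rather than the actual component.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ParseCleanupSketch {
    // Counts the stand-in pages that are open right now.
    static final AtomicInteger OPEN_PAGES = new AtomicInteger();

    /** Hypothetical stand-in for the parser's page; NOT the org.htmlparser API. */
    static final class FakePage {
        private boolean closed;

        FakePage() {
            OPEN_PAGES.incrementAndGet();
        }

        void close() {
            if (!closed) {
                closed = true;
                OPEN_PAGES.decrementAndGet();
            }
        }
    }

    /** "Parses" one page and always releases it, even when parsing throws. */
    static int parseOne(String html) {
        FakePage page = new FakePage();
        try {
            if (html.isEmpty()) {
                throw new IllegalArgumentException("simulated parse failure");
            }
            return html.length(); // placeholder for the real data extraction
        } finally {
            // In the real component, the two cleanup statements shown earlier
            // would go here instead of this stand-in call.
            page.close();
        }
    }

    /** Runs parseOne over all pages and returns how many parsed successfully. */
    static int parseAll(String[] pages) {
        int ok = 0;
        for (String html : pages) {
            try {
                parseOne(html);
                ok++;
            } catch (IllegalArgumentException e) {
                // Skip the bad page and keep going; its page was still closed.
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        String[] pages = {"<html>a</html>", "", "<html>b</html>"};
        System.out.println("parsed: " + parseAll(pages));
        System.out.println("still open: " + OPEN_PAGES.get());
    }
}
```

The point of the finally block is that the second (empty) page throws, yet no page is left open afterwards.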

And the “Memory Leak” seems to have vanished…

I keep this blog (which I don’t update very often, but that’s another story) and I post both personal and professional (in the sense that it will be appreciated by Computer Scientists) content. I would like to import all blog posts automatically from this blog into my Facebook and LinkedIn profiles.

The thing is that I don’t want to post Computer Science related content to Facebook, and I don’t want to post all kinds of rubbish to my (professional) LinkedIn profile. I found out that it is easy to selectively syndicate content depending on the tags that I add to a post.

So I added two tags: an import_facebook tag for posts that I want to be imported into my Facebook profile and an import_linkedin tag for posts that I want to be imported into my LinkedIn profile.

Then I installed the “WordPress” application for LinkedIn and set the feed URL of my blog to the following: http://kyriakos.anastasakis.net/tag/import_linkedin

I also installed RSS Graffiti for Facebook and set the URL of my feed to the following: http://kyriakos.anastasakis.net/tag/import_facebook/feed

From now on any WordPress post I tag as “import_facebook” will be imported into my Facebook profile, while every WordPress post I tag as “import_linkedin” will be imported into my LinkedIn profile. Before installing the WordPress and the RSS Graffiti apps on your LinkedIn and Facebook profiles respectively, you need to have at least one post tagged “import_facebook” and one post tagged “import_linkedin” for the applications to pick up the links properly.
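To make the idea a bit more concrete, here is a small sketch of what the per-tag feeds amount to. It assumes the /tag/<tag>/feed URL pattern seen in the Facebook feed above. The class and method names and the sample post titles are invented for illustration only; nothing here talks to WordPress.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TagFeedSketch {
    /** Builds a per-tag feed URL of the form <blog>/tag/<tag>/feed (assumed pattern). */
    static String tagFeedUrl(String blogUrl, String tag) {
        // Drop one trailing slash so we never produce a double slash in the result.
        String base = blogUrl.endsWith("/")
                ? blogUrl.substring(0, blogUrl.length() - 1)
                : blogUrl;
        return base + "/tag/" + tag + "/feed";
    }

    /** Returns the titles of the posts carrying the given tag, in insertion order. */
    static List<String> postsForTag(Map<String, List<String>> tagsByTitle, String tag) {
        List<String> titles = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : tagsByTitle.entrySet()) {
            if (entry.getValue().contains(tag)) {
                titles.add(entry.getKey());
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        // Made-up posts, used only to show which feed each one would land in.
        Map<String, List<String>> posts = new LinkedHashMap<>();
        posts.put("Holiday photos", Arrays.asList("import_facebook"));
        posts.put("HTML parser memory leak", Arrays.asList("java", "import_linkedin"));
        posts.put("Untagged draft", Arrays.asList("misc"));

        for (String tag : Arrays.asList("import_facebook", "import_linkedin")) {
            System.out.println(tagFeedUrl("http://kyriakos.anastasakis.net", tag)
                    + " -> " + postsForTag(posts, tag));
        }
    }
}
```

A post with neither tag (the third one here) simply shows up in neither feed, which is the whole point of the setup.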