My thesis research was dedicated to web content extraction algorithms, for which I conducted a comprehensive study of existing solutions in this sparse field.
Tag Archives: evaluation
UPDATE 11/6/2011: Added the summary and the results table. Lately I’ve been working on evaluating and comparing algorithms capable of extracting useful content from arbitrary HTML documents. Before continuing, I encourage you to go through some of my previous posts, just to … Continue reading
In my two previous posts (both were featured on Hacker News, ReadWriteWeb and O’Reilly Radar) I’ve covered a decent array of text extraction methods and related software. So before reading this one, I encourage you to read them to get … Continue reading