Link pop over on-page is definitely an important concern, as links are ultimately harder to manipulate.
If I were developing a search engine, I'd also reference the meta keywords and check that each one is listed on the page. Those same keywords must also be present in the meta description in the form of a sentence, not simply stuffed in as keywords.
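A rough sketch of that check in Python, just to make the idea concrete. The function names and the "sentence" heuristic are my own assumptions, and the plain substring matching is deliberately crude (a real engine would tokenise properly):

```python
import re

def meta_keywords_check(keywords, description, body_text):
    """Fraction of meta keywords found in BOTH the page body and the description."""
    body = body_text.lower()
    desc = description.lower()
    kws = [k.strip().lower() for k in keywords.split(",") if k.strip()]
    if not kws:
        return 0.0
    # Simple substring match; an assumption made to keep the sketch short.
    hits = sum(1 for k in kws if k in body and k in desc)
    return hits / len(kws)

def looks_like_sentence(description):
    """Crude guess: a stuffed description is mostly comma-separated terms."""
    words = re.findall(r"[A-Za-z']+", description)
    commas = description.count(",")
    return len(words) >= 5 and commas < len(words) / 2
```

A page would only get credit for its meta tags when both checks pass, ie the keywords are backed up by the page and the description reads like a sentence.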
The fact that meta-tags are so little regarded these days means that they are possibly ripe for referencing in a relevancy algo.
If you could get it to read CSS and/or JavaScript, not only would it offer more useful results, but the technology would almost certainly be a good investment for other search engines.

(Presuming they haven't already got the tech, but simply cannot allocate the resources to use it fully.)
Treat the title tag as having a set character limit - weight keywords as a percentage factor of their presence in the title, ie:
1 <title>Britecorp: Internet Marketing</title>
2 <title>Britecorp: Internet marketing, SEO, link building, PPC, etc</title>
Treat 1 as being more relevant for "internet", "marketing" & "internet marketing", because it is more focussed on those keywords - title keyword density, I guess.
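To put numbers on the two titles above, here is a minimal sketch in Python. The 60-character limit and the function name are assumptions of mine, not anything a real engine is known to use:

```python
import re

def title_keyword_weight(title, query, max_chars=60):
    """Share of the title's words that match the query terms."""
    # Anything past the character limit is ignored (the limit is an assumption).
    words = re.findall(r"[a-z0-9]+", title[:max_chars].lower())
    if not words:
        return 0.0
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    matches = sum(1 for w in words if w in terms)
    return matches / len(words)

short_title = "Britecorp: Internet Marketing"
long_title = "Britecorp: Internet marketing, SEO, link building, PPC, etc"

print(title_keyword_weight(short_title, "internet marketing"))  # 2 of 3 words match
print(title_keyword_weight(long_title, "internet marketing"))   # 2 of 8 words match
```

The focussed title scores about 0.67 against 0.25 for the long one, which is the "title keyword density" effect described above.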
Also, score h tags in the same manner. Count only the first h1 tag of a page as worth anything, and apply a character limit to it as above.
Maybe a small score for an h2 tag, and maybe a small score for keyword density in <p> tags - with anything higher than around 5% flagging filters for overstuffing (ie, negative scoring).
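The density idea with a stuffing filter could look something like this in Python. It is only a sketch: the 5% threshold comes from the note above, but the scoring scale, the flat penalty and the single-word keyword restriction are my own assumptions:

```python
import re

def paragraph_density_score(text, keyword, stuffing_threshold=0.05, weight=1.0):
    """Small positive score for modest keyword density, negative once stuffed."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    # Single-word keywords only, to keep the sketch short.
    density = words.count(keyword.lower()) / len(words)
    if density > stuffing_threshold:
        return -weight  # overstuffing filter: flat negative score
    # Scale linearly so that density at the threshold earns the full weight.
    return weight * density / stuffing_threshold
```

So a paragraph at 2% density earns a modest score, while one at 50% gets penalised rather than rewarded.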
Devalue links after first 20 from any single IP block.
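One possible way to do the IP block rule, sketched in Python. I am assuming an "IP block" means a /24 (old class C) range and that devalued links keep a tenth of their value; both numbers are guesses, and only the cap of 20 comes from the note above:

```python
from collections import defaultdict
import ipaddress

def ip_block(ip, prefix=24):
    """Map an address to its containing block (a /24 by assumption)."""
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)

def link_weights(source_ips, full_credit=20, reduced=0.1):
    """Full weight for the first `full_credit` links per block, reduced after."""
    seen = defaultdict(int)
    weights = []
    for ip in source_ips:
        block = ip_block(ip)
        seen[block] += 1
        weights.append(1.0 if seen[block] <= full_credit else reduced)
    return weights
```

Links are processed in the order given, so which 20 get full credit depends on crawl order in this sketch.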
Devalue links that all contain the exact same link text and/or keywords.
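A simple way to devalue repeated link text, again as a Python sketch. The rule used here (each distinct anchor text shares a total weight of 1.0 among its copies) is my own assumption about how the devaluing might work:

```python
from collections import Counter

def anchor_weights(anchors):
    """Split a weight of 1.0 across all links that share the same anchor text."""
    # Normalise case and whitespace so trivial variations count as identical.
    normalised = [" ".join(a.lower().split()) for a in anchors]
    counts = Counter(normalised)
    return [1.0 / counts[a] for a in normalised]
```

With this, three hundred links all reading "web hosting" are worth no more in total than one, while a link with unique text keeps its full value.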
Do topic clusters and related topic relationships, if possible.
Ah - a link has most value if its anchor text includes keywords that relate to the subject of the page linked from as well as the page linked to. So a link from a webmaster site to another webmaster site has more value than a link from a ****** site to a webmaster site.
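As a sketch of that relevance test in Python: I am assuming each page has already been reduced to a set of topic terms (how you get those is a separate problem), and the multiplication of the two overlaps is just one plausible way to require both ends to match:

```python
def link_relevance(anchor, source_terms, target_terms):
    """Score a link by how well its anchor text matches BOTH pages' topics."""
    anchor_words = set(anchor.lower().split())
    if not anchor_words:
        return 0.0
    src = len(anchor_words & source_terms) / len(anchor_words)
    tgt = len(anchor_words & target_terms) / len(anchor_words)
    # Multiplying means an off-topic source page zeroes the link's value.
    return src * tgt

webmaster_a = {"webmaster", "seo", "forum"}
webmaster_b = {"webmaster", "forum", "hosting"}
unrelated = {"gardening", "plants"}

print(link_relevance("webmaster forum", webmaster_a, webmaster_b))  # on-topic both ends
print(link_relevance("webmaster forum", unrelated, webmaster_b))    # off-topic source
```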
Spot FFA lists and *ignore* all links - blacklist entire domain for link value (but not ranking value)
*Randomise* the ranking results somewhat. Older sites naturally have better linkage, but not necessarily better content. Help mix the list by making ranks *approximate* rather than definite.
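One way to make ranks approximate is to add a little noise to each score before sorting. A minimal Python sketch, where the 5% jitter and the function name are assumptions of mine:

```python
import random

def jittered_ranking(scores, jitter=0.05, rng=None):
    """Rank URLs after nudging each score up or down by up to `jitter`."""
    rng = rng or random.Random()
    noisy = {url: s * (1 + rng.uniform(-jitter, jitter)) for url, s in scores.items()}
    return sorted(noisy, key=noisy.get, reverse=True)

scores = {"old-site.example": 1.00, "new-site.example": 0.98, "weak.example": 0.50}
print(jittered_ranking(scores))
```

Pages with near-identical scores swap places from query to query, while a clearly weaker page stays where it is, so the mixing only happens among results that were close anyway.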
Devalue *pages* containing any form of affiliate link or modified link (ie, /link.php?ID=2323232).
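Spotting that kind of link could start with the query string. A rough Python sketch: the list of parameter names and the 0.5 penalty are purely my assumptions, and a generic name like "id" will throw plenty of false positives in practice:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical list of parameter names that hint at affiliate/tracking links.
TRACKING_PARAMS = {"id", "aff", "affid", "ref", "affiliate"}

def looks_like_affiliate_link(url):
    """True if the URL's query string carries a suspected tracking parameter."""
    params = {name.lower() for name in parse_qs(urlparse(url).query)}
    return bool(params & TRACKING_PARAMS)

def page_weight(links, penalty=0.5):
    """Devalue the whole page if any of its links looks like an affiliate link."""
    return penalty if any(looks_like_affiliate_link(u) for u in links) else 1.0

print(looks_like_affiliate_link("/link.php?ID=2323232"))  # True
```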
Make sure the results are entirely unbiased - if sevenseek as a search engine calls up only v7 pages for a search of "web hosting", you risk your credibility and your investment in sevenseek.
Have some degree of human editing - but *only* for clearly marked offences - hidden text, definite cloaking, doorways, etc.
2c for now.
