There is definitely significant cost in spidering the wild web, and significant programming time too. On face value it seems easy: keep a list of URLs and an index of which pages link to which, plus anchor text and other metadata. I think the big issues are how to determine your crawl lists, what deserves more frequent spidering, and how to avoid spider traps, bad .htaccess rules and low-quality pages. I've tried most of the paid link tools sparingly, and although they boast a high volume of URLs, some of those URLs haven't been crawled in a while.
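Just to make the "list of URLs plus link index" idea concrete, here's a minimal sketch (my own assumptions, not anyone's production schema) of the kind of record such an index would hold, plus a crude heuristic for skipping likely spider traps:

```python
# Hypothetical link-index record: source page, target page, anchor text,
# and a little crawl metadata. Names and fields are illustrative only.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from urllib.parse import urlparse


@dataclass
class LinkRecord:
    source_url: str
    target_url: str
    anchor_text: str
    nofollow: bool = False
    first_seen: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def looks_like_spider_trap(url: str, max_path_depth: int = 12, max_length: int = 512) -> bool:
    """Crude heuristic: very deep paths or very long URLs (calendars,
    session IDs, faceted search) are often traps and not worth fetching."""
    parsed = urlparse(url)
    depth = len([segment for segment in parsed.path.split("/") if segment])
    return depth > max_path_depth or len(url) > max_length


if __name__ == "__main__":
    rec = LinkRecord("https://example.com/a", "https://example.org/b", "example anchor")
    print(rec)
    print(looks_like_spider_trap("https://example.com/" + "x/" * 20))  # True
```

The hard part isn't the record format, of course; it's deciding which of those records deserve a re-fetch and how often.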
For your costing/projections, I guess you could say 'fetching 100 billion URLs a month' would be a good target: a couple of thousand spiders and off you go. Majestic claims to have fetched 0.3 billion URLs and found over 3.5 trillion.
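As a rough sanity check on that target, a quick back-of-the-envelope calculation (assuming 30-day months and the couple-of-thousand-spiders figure above):

```python
# Back-of-the-envelope throughput for 'fetching 100 billion URLs a month'.
URLS_PER_MONTH = 100_000_000_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # ~2.6 million seconds
SPIDERS = 2_000                         # the "couple of thousand" from above

total_rate = URLS_PER_MONTH / SECONDS_PER_MONTH  # cluster-wide fetches per second
per_spider = total_rate / SPIDERS                # fetches per second per spider

print(f"Cluster-wide: ~{total_rate:,.0f} fetches/sec")  # ~38,600/sec
print(f"Per spider:   ~{per_spider:,.1f} fetches/sec")  # ~19/sec
```

So roughly 38,600 fetches a second overall, or about 19 a second per spider, and that's before retries, politeness delays and re-crawling the pages that change often.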