| Age | Commit message | Author |
|---|---|---|
| 6 days | feat: upgrade hamlet and use newer stuff from there [master] | demo |
| 6 days | feat: use hamlet package to simplify command-line arguments | demo |
| 6 days | chore: install hamlet | demo |
| 8 days | feat: specify whether URL sources are both missing or both present | demo |
| 8 days | feat: implement shortcode feature | demo |
| 8 days | refactor: move initial URL parsing into function 'convertToURL' | demo |
| 8 days | chore: add urls.csv shortcodes file | demo |
| 8 days | docs: add some comments | demo |
| 8 days | feat: add a user agent header! | demo |
| 8 days | feat: add "total entries" as part of XML comment | demo |
| 8 days | feat: prettify stats when figure was passed in as 0 | demo |
| 8 days | feat: place comment before URL listing | demo |
| 8 days | feat: add comment logging maxDepth and maxURLs inside xml output | demo |
| 8 days | chore: ignore xml output | demo |
| 8 days | feat: save sitemap to a file | demo |
| 8 days | feat: check for missing https:// | demo |
| 8 days | feat: add header to xml output | demo |
| 8 days | wip: generate rough draft of sitemap | demo |
| 8 days | docs: expand findURLs godoc | demo |
| 8 days | docs: add comment explaining purpose of log.Lshortfile | demo |
| 8 days | refactor: move html document creation to getBatch | demo |
| 8 days | chore: include shortfile printout in log invocations | demo |
| 9 days | fix: make select statement block unless communication takes place | demo |
| 9 days | feat: configure maxDepth from the command line | demo |
| 9 days | wip: prototype a max-depth limitation | demo |
| 9 days | feat: update the classic crawler to track depth via packets | demo |
| 9 days | refactor: move packet definitions to their own file | demo |
| 9 days | refactor: move "packet conversion" into a separate function | demo |
| 10 days | docs: add extensive comments | demo |
| 10 days | feat: measure the depth where each URL is found | demo |
| 10 days | feat: add some prints to prove we need to select on Done() | demo |
| 10 days | refactor: eliminate redundant select statement | demo |
| 10 days | fix: make sure all workers terminate by the end | demo |
| 10 days | feat: add early termination condition based on maxURLs | demo |
| 10 days | feat: add break condition from worklist loop | demo |
| 10 days | feat: add the worker-pool-based crawer from TGPL | demo |
| 10 days | docs: add an "Awesome Go" section to the README | demo |
| 10 days | docs: save websites I usually use with this crawler | demo |
| 10 days | fix: release semaphore at the proper time | demo |
| 10 days | feat: implement maxConcurrency using a buffered channel 'sema' | demo |
| 10 days | feat: add cancellation feature | demo |
| 10 days | feat: hit 'em with the classic web crawler | demo |
| 10 days | refactor: make deduplication part of main goroutine | demo |
| 10 days | feat: restore original "print 45 and hang" behavior | demo |
| 10 days | feat: change all channel payloads to pointer types | demo |
| 10 days | feat: reveal bug in the channel linkage topology | demo |
| 10 days | feat: add some code to cancel | demo |
| 10 days | refactor: use SplitSeq instead of Split | demo |
| 10 days | feat: add gouroutine-leak profiling | demo |
| 10 days | feat: design worker-pool webcrawler | demo |