indie-stats
indie-stats is an open source Python project to gather mf2 data for IndieWeb domains and generate stats.
It generates a domains.json file for each domain with metadata for the site and its status. This is needed because quite a few of the domains return a 404 or time out.
Each domain is stored as flat files:
- bear.im
- bear.im.json -- metadata for the domain
- 20140921T001159_bear.im.json -- data for the domain at the time it was polled, one per poll
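The naming scheme above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the helper names `metadata_path` and `poll_path`, the `root` storage directory, and the reading of the first list entry (`bear.im`) as a per-domain directory are all assumptions.

```python
from datetime import datetime
from pathlib import Path

# Timestamp layout inferred from the example file 20140921T001159_bear.im.json
STAMP_FORMAT = "%Y%m%dT%H%M%S"


def metadata_path(root: Path, domain: str) -> Path:
    """Path of the metadata file for a domain (hypothetical helper)."""
    # Assumes each domain gets its own directory under root.
    return root / domain / f"{domain}.json"


def poll_path(root: Path, domain: str, polled_at: datetime) -> Path:
    """Path of the data file written for a single poll (hypothetical helper)."""
    stamp = polled_at.strftime(STAMP_FORMAT)
    return root / domain / f"{stamp}_{domain}.json"
```

For example, `poll_path(Path("data"), "bear.im", datetime(2014, 9, 21, 0, 11, 59))` yields a file named `20140921T001159_bear.im.json`. Because the timestamp prefix sorts lexically, a plain sorted directory listing also orders the polls chronologically.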
Routes
- https://indie-stats.com/domain to claim your domain or mark it as excluded
- https://indie-stats.com/domain?domain=bear.im to view domain details
- https://indie-stats.com/login
- https://indie-stats.com/logout
- https://indie-stats.com/api/v1/domains
Features
- Domain owners can log in and claim their domains and/or exclude them from being processed
- Crawl IndieWeb domains and store:
  - mf2 data
  - HTML content
  - request and response headers
- Maintain metadata for domains showing their current status
- Domain list is seeded from chat-names
Working On
Storing request and response headers
Generate stats
For each domain crawled, the domain, timestamp, and data will be passed to a master "cruncher" that will then loop through a list of stat-generating apps. The JSON blob returned by each app will be added, along with its namespace and timestamp, to the stat history for the domain. Stat items to calculate:
- have a header for auth
- use indieauth as their auth item
- have a header for webmention
- have a header for micropub
- have an h-card
- have an h-entry
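Since this part is still being worked on, the following is only a rough sketch of how such a cruncher might look. Everything named here is hypothetical: the `crunch` function, the `STAT_GENERATORS` registry, the two example generators, and the shape of the history entries. It assumes `data` is parsed mf2 JSON with the usual top-level `items` and `rels` keys, and it reads the webmention endpoint from `rels` rather than from the stored response headers.

```python
def has_h_card(domain, timestamp, data):
    """Report whether any top-level mf2 item is an h-card (nested items are ignored)."""
    items = data.get("items", [])
    return {"present": any("h-card" in item.get("type", []) for item in items)}


def has_webmention(domain, timestamp, data):
    """Report whether the page advertises a webmention endpoint via rel values."""
    return {"present": bool(data.get("rels", {}).get("webmention"))}


# Hypothetical registry: namespace -> stat-generating callable.
STAT_GENERATORS = {
    "h-card": has_h_card,
    "webmention": has_webmention,
}


def crunch(domain, timestamp, data, history):
    """Run every stat generator and append its JSON-ready result to the history."""
    for namespace, generate in STAT_GENERATORS.items():
        history.append({
            "namespace": namespace,
            "timestamp": timestamp,
            "stats": generate(domain, timestamp, data),
        })
    return history
```

Keeping each generator as a plain function with the same three arguments means a new stat can be added by registering one more entry, without touching the cruncher loop.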
Stat retrieval
Add an endpoint that takes a domain and a date range and returns the JSON blob of stats for that range.
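The core of such an endpoint is a date-range filter over a domain's stat history. The sketch below shows only that filtering step and leaves out routing and storage. The function name `stats_in_range` and the entry layout (a dict with a `timestamp` string in the same `YYYYMMDDTHHMMSS` form used for the poll files) are assumptions, not part of the project.

```python
from datetime import datetime

# Same timestamp layout as the poll file names, e.g. 20140921T001159
STAMP_FORMAT = "%Y%m%dT%H%M%S"


def stats_in_range(history, start, end):
    """Return the history entries whose timestamp falls within [start, end]."""
    selected = []
    for entry in history:
        polled_at = datetime.strptime(entry["timestamp"], STAMP_FORMAT)
        if start <= polled_at <= end:
            selected.append(entry)
    return selected
```

The range is inclusive at both ends here; an endpoint built on this would still need to decide how the dates are passed in and how a missing or unknown domain is reported.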