2013/HTML is your Data
HTML is your data was a brainstorming session during IndieWebCamp 2013.
Notes archived from https://etherpad.mozilla.org/indiewebcamp-htmldata; probably need cleaning --Waterpigs.co.uk 10:31, 24 June 2013 (PDT)
HTML is your data
2013-06-22 17:00-17:15
- htmldata
- indiewebcamp
- Use cases for web data
- What HTML can we use for them
- What are the issues we run into?
Use-case: Feeds (what used to be RSS)
- HTML: h-entry (individual posts, home pages as feeds)
- Issue: last-modified. With RSS/Atom you can send If-Modified-Since and get back a 304 with no body, so there is nothing to parse and no other tests to run
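The same conditional-GET behaviour can in principle be offered for HTML pages. A minimal sketch of the server-side check, assuming the server knows when a page last changed; `not_modified` is a hypothetical helper, not part of any library:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def not_modified(if_modified_since, last_modified):
    """Return True when a 304 Not Modified (no body) would be a valid reply.

    if_modified_since: raw If-Modified-Since header value, or None.
    last_modified: timezone-aware datetime of the page's last change (assumed).
    """
    if not if_modified_since:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # unparseable header: fall back to a full response
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    return last_modified.replace(microsecond=0) <= since


# Example: a page last changed at the time of this session.
page_changed = datetime(2013, 6, 22, 17, 0, tzinfo=timezone.utc)
header = format_datetime(page_changed, usegmt=True)
print(not_modified(header, page_changed))
```

A consumer that gets the 304 can skip fetching and parsing the HTML entirely, which is the property the notes say feeds already have.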
Use-case: authentication discovery
- RelMeAuth / IndieAuth rel-me to an OAuth provider profile URL
- Issue: providers dropping rel-me support (FB), or wrapping links with their own URLs (Twitter t.co)
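Discovery starts by collecting the rel-me links on a page. A minimal sketch using only the Python standard library; `RelMeParser` and `rel_me_links` are illustrative names, and the sample URLs are made up:

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect href values of <a> and <link> elements whose rel includes "me"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if "me" in rels and attrs.get("href"):
            self.links.append(attrs["href"])


def rel_me_links(html):
    parser = RelMeParser()
    parser.feed(html)
    return parser.links


sample = (
    '<a rel="me" href="https://twitter.com/example">Twitter</a>'
    '<a href="/about">About</a>'
    '<link rel="me authn" href="https://github.com/example">'
)
print(rel_me_links(sample))
```

This sketch does not resolve relative URLs or follow redirects, so a link wrapped by a provider's shortener would come back as the wrapped URL.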
Use-case: Read-it-later -> Pocket / Instapaper / Readability
- HTML: hAtom / hNews - consumed by Readability
Use-case: social-graph - identity clustering
- rel-me symmetric link graph - supported by identengine.com (consumes)
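The symmetric check can be sketched as follows, assuming the rel-me links of each page have already been fetched into a dictionary. The `rel_me` mapping and all URLs are hypothetical, and URL normalization is ignored:

```python
def confirmed_identities(start_url, rel_me):
    """Return the URLs that start_url links to and that link back to it.

    rel_me maps a page URL to the collection of rel-me URLs found on that page.
    """
    return {
        target
        for target in rel_me.get(start_url, ())
        if start_url in rel_me.get(target, ())
    }


graph = {
    "https://example.com/": ["https://twitter.com/example", "https://github.com/example"],
    "https://twitter.com/example": ["https://example.com/"],
    "https://github.com/example": [],  # no link back, so not confirmed
}
print(confirmed_identities("https://example.com/", graph))
```

Only the two-way links count, which is what keeps someone from claiming a profile they do not control.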
Use-case: n-factor-auth
- RelMeAuth requiring multiple OAuth providers
Use-case: home page security (Aside: HTTPS vs HTTP)
- https on your personal domain - WillNorris.com does this, is all https
- SSL with SNI (Server Name Indication, an extension to TLS which lets the client specify which hostname it is trying to connect to, so the server can provide a certificate for that hostname). This is the only practical way we're going to have tons of secure personal sites on shared hosting (same IP address).
- cheap web hosting is typically shared hosting (share IP address with strangers)
- or you host multiple people on your domain, e.g. given.familyname.org
- issue: SNI doesn't work in any version of IE on Windows XP. IE needs to be version 7 or later on Vista or later.
- issue: lack of support in client libraries, e.g. the Haskell library only recently added support for SNI. Janrain hasn't updated their Haskell library yet, so you can't log in using any Janrain OpenID service.
- support document for Persona has to be on https
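On the client side, SNI comes down to passing the hostname when the TLS connection is wrapped. A minimal sketch with Python's standard `ssl` module; `fetch_certificate` is a hypothetical helper and needs network access when called:

```python
import socket
import ssl


def fetch_certificate(hostname, port=443, timeout=5.0):
    """Connect with SNI and return the certificate the server presented."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        # server_hostname is what gets sent as the SNI value; a shared host
        # uses it to pick the certificate for this particular site.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


# ssl.HAS_SNI reports whether the underlying TLS library supports SNI at all.
print(ssl.HAS_SNI)
```

A client library without an equivalent of `server_hostname` gets whatever default certificate the shared IP serves, which is the failure described in the client-library issue above.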
Use-case: scientific paper citations
Use-case: streams of physical measurement data from new devices
- trend: accept a REST API, and provide JSON
- opportunity: this is how you should be providing a stream of data from devices
- microformats2 library available for converting HTML with microformats to canonical JSON representation
- How to return JSON version of your web page?
- Example: add ".json" to any URL on aaronparecki.com
- What about the HTTP Accept header? Accept: application/json, i.e. content negotiation (conneg)
- Do both?
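"Do both" could look like the sketch below: the ".json" suffix wins, otherwise the Accept header decides. This is a simplified illustration; `wants_json` is a hypothetical name, and wildcard media types such as `*/*` are deliberately ignored:

```python
def _quality(accept, media_type):
    """Return the highest q-value given to media_type in an Accept header."""
    best = 0.0
    for item in accept.split(","):
        name, *params = [part.strip() for part in item.split(";")]
        if name.lower() != media_type:
            continue
        q = 1.0
        for param in params:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        best = max(best, q)
    return best


def wants_json(path, accept=""):
    """Return (json_requested, path_of_the_html_page)."""
    if path.endswith(".json"):
        return True, path[: -len(".json")]
    json_q = _quality(accept, "application/json")
    html_q = _quality(accept, "text/html")
    return (json_q > 0 and json_q >= html_q), path


print(wants_json("/metrics/2013/06/22/061515/.json"))
print(wants_json("/notes/1", "application/json"))
```

The suffix form is easy to link to and test in a browser, while the Accept form keeps one URL per resource; supporting both costs only the few lines above.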
Use-case: email address based discovery
- from alice@example.com ->
- WebFinger (WF): https://example.com/.well-known/webfinger?resource=acct:alice@example.com
- that returns JSON
- what does this tell you?
- rel=feed - URL
- rel=openid.server - URL
- but it could return HTML+microformats
- could eliminate with:
- just put the rel values on example.com
- could eliminate the well known URI with a new rel value
- rel="rels" href="rels.html"
- rels.html has all the same rel/URL pairs as your WebFinger JSON
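The WebFinger lookup described above can be sketched as follows: build the well-known URL from the address, then reduce the JSON response to rel/URL pairs. The function names and the sample response are assumptions for illustration; no request is actually made:

```python
import json
from urllib.parse import urlencode


def webfinger_url(address):
    """Build the WebFinger query URL for an address like alice@example.com."""
    local, _, host = address.rpartition("@")
    if not local or not host:
        raise ValueError("expected an address of the form user@host")
    query = urlencode({"resource": "acct:" + address})
    return "https://" + host + "/.well-known/webfinger?" + query


def links_by_rel(jrd_text):
    """Map each rel value in a WebFinger JSON document to its href values."""
    rels = {}
    for link in json.loads(jrd_text).get("links", []):
        if "rel" in link and "href" in link:
            rels.setdefault(link["rel"], []).append(link["href"])
    return rels


# Made-up response using the two rel values mentioned in the notes.
sample = json.dumps({
    "subject": "acct:alice@example.com",
    "links": [
        {"rel": "feed", "href": "https://example.com/alice/feed"},
        {"rel": "openid.server", "href": "https://example.com/openid"},
    ],
})
print(webfinger_url("alice@example.com"))
print(links_by_rel(sample))
```

The result is nothing more than rel/URL pairs, which is why the notes suggest plain rel links on example.com or on a rels page could carry the same information.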
Use-case: backups from web applications
Use-case: exports from web sites/silos/applications. Examples:
- Google Takeout - G+ export has HTML+microformats
- Facebook export has HTML+microformats
- Twitter archive has JSON that it renders client-side as HTML+microformats
Use-case: import an export into your own site
- i.e. transitioning from sharecropping to owning your data
Use-case: outlines
Use-case: rating information
- HTML: h-review
Use-case: self-tracking
- http://aaronparecki.com/presentations/2013/06/19/1/low-friction-personal-data-collection
- has markup examples of his own self-tracking data
- sleep example (HTML): http://aaronparecki.com/metrics/2013/06/22/061515/
- sleep example (JSON): http://aaronparecki.com/metrics/2013/06/22/061515/.json
Use-case: one endpoint for robot clients and human clients
- Accept: MIME multipart
- what would a browser do if you sent it a MIME multipart response?
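For discussion, here is roughly what such a body could contain: one part for robots (JSON) and one for humans (HTML). A minimal sketch using Python's `email` package, assuming `multipart/alternative` as the container; `build_multipart_body` is a hypothetical name, and how a browser would treat the result remains the open question:

```python
import json
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_multipart_body(html, data):
    """Bundle a JSON part and an HTML part into one multipart message."""
    message = MIMEMultipart("alternative")
    json_bytes = json.dumps(data).encode("utf-8")
    message.attach(MIMEApplication(json_bytes, _subtype="json"))
    message.attach(MIMEText(html, "html", "utf-8"))
    return message


body = build_multipart_body("<p class='p-name'>Hello</p>", {"name": "Hello"})
print(body.get_content_type())
print([part.get_content_type() for part in body.get_payload()])
```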
General Issues
- Built-in HTML+microformats2 parsing?
- PHP - use php-mf2 library
- Ruby - use microformats2 gem
- Node/JS - microformats.node on GitHub
- Python - Randall Leeds - volunteers to write one!
- Parsing speed at scale of HTML data
- use a proxy like pin13.net
- "But what if I don't want to display the data on my web page?"
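On the parsing question, the sketch below shows the general shape of turning HTML with microformats2 into the canonical JSON structure (`items`, each with `type` and `properties`). It is a deliberately tiny illustration, not a conforming parser: it assumes well-formed markup with every non-void element closed, handles only top-level items, and supports only `p-*` (text), `u-*` (href) and `dt-*` (datetime attribute). `TinyMf2Parser` and `parse_mf2` are made-up names; real projects should use one of the libraries listed above.

```python
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TinyMf2Parser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.items = []
        self._item = None      # item currently being built, if any
        self._item_depth = 0   # element depth of that item's root
        self._depth = 0        # current element nesting depth
        self._open = []        # open p-* properties: (depth, name, text chunks)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self._depth += 1
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        roots = [c for c in classes if c.startswith("h-")]
        if roots and self._item is None:
            self._item = {"type": roots, "properties": {}}
            self._item_depth = self._depth
            return
        if self._item is None:
            return
        props = self._item["properties"]
        for cls in classes:
            if cls.startswith("u-") and attrs.get("href"):
                props.setdefault(cls[2:], []).append(attrs["href"])
            elif cls.startswith("dt-") and attrs.get("datetime"):
                props.setdefault(cls[3:], []).append(attrs["datetime"])
            elif cls.startswith("p-"):
                self._open.append((self._depth, cls[2:], []))

    def handle_data(self, data):
        for _, _, chunks in self._open:
            chunks.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Close any p-* properties that were opened by this element.
        while self._open and self._open[-1][0] == self._depth:
            _, name, chunks = self._open.pop()
            text = "".join(chunks).strip()
            self._item["properties"].setdefault(name, []).append(text)
        if self._item is not None and self._depth == self._item_depth:
            self.items.append(self._item)
            self._item = None
        self._depth -= 1


def parse_mf2(html):
    parser = TinyMf2Parser()
    parser.feed(html)
    parser.close()
    return {"items": parser.items}


sample = (
    '<article class="h-entry">'
    '<h1 class="p-name">Hello</h1>'
    '<a class="u-url" href="https://example.com/1">permalink</a>'
    '<time class="dt-published" datetime="2013-06-22T17:00">5pm</time>'
    '</article>'
)
print(parse_mf2(sample))
```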