The Wayback Machine - http://web.archive.org/web/20031003002213/http://www.gotdotnet.com:80/team/dbox/default.aspx?month=2003-06
 

Don Box's Spoutlet

Undoing four years of XSLT-inflicted damage inside the big house

June, 2003

Done in Dallas

This was my second TechEd since taking the red pill.

 

This year was considerably harder than last year.  Last week, I was heads-down coding every day on new stuff. Couple the two-day hiccup in my work with the fact that the weather in Seattle has been convertible-friendly, and it took a lot of will to get on the plane and go to Texas.

 

Here's the post-mortem on my trip so far:

 

Sunday

1) I gave a small talk on WS futures to the MSFT Regional Directors (RDs).

2) Yasser and I did a short one-hour Q&A session as part of a larger pre-conference event.

3) I hosted a panel with these guys. No Sells, but Yasser (the latest "legend") was in attendance. One thing that was clear from #2 and #3 was that people want deep XML support from VS.NET - it's nice to know I'm not alone.

 

Monday

1) I gave a broad WS talk in the arena. This was the big talk for me, as it was in the big room and there are expectations.  I did most of the talk in raw XML, toggling between IE, EMACS, and Office 2003.  I'll post the main message of the talk later.

 

2) I hosted a small panel of WS friends (SteveSw, YasserS, and Clemens). The highlight was easily Steve's answer to the "is COM dead" question. To paraphrase Steve: "Like humans, technologies stop growing in size as they age. Also like humans, once a technology reaches the age where growth stops, the culture cares less about them - they're somehow less interesting."

 

I also had the treat of seeing fellow blogger Fumiaki Yoshimatsu (a.k.a., Centaur's Identity). In case you hadn't noticed, his blog moved recently. Re-subscribed.

MSDN TV

My MSDN TV spot just went live.

 

I had forgotten how hard it is to do pre-recorded talks. Hopefully the next one will be easier.

Wow, this nails why I've stopped reading most weblogs

This may have already made the rounds, but it really captures (for me) why I stopped reading lots of weblogs.

 

Chris Brumme's weblog has been nice, however. I especially love the notion of marshal-by-bleed.

Jan Gray's Performance Piece is on MSDN

Have at it!

Cool IE feature

I've been spelunking through URLMON land lately. Pretty interesting stuff (we'll see how well COM interop and IE get along next week).

 

The coolest thing I found was the view-source: scheme that URLMON recognizes.

 

If you are an IE user, type "view-source:http://www.gotdotnet.com/team/dbox/default.aspx" into the address bar to see my site's HTML representation.

A data model for log entries

Sam just launched a WIKI to capture what exactly we should be slinging around in the N-way web.

 

This should be fun.

If you're Funky and you're Valid, clap your hands

According to Phil, I'm funky.

 

According to Sam and Mark, I'm valid.

 

Valid and funky is good. 

 

If you don't want funkiness, there's little reason to use XML.

XML eclipses COM

Dilip Kumar just sent me this link that tries to taxonomize the XML landscape.

 

It really drives home how complex XML development can be once you try to adopt the appropriate standard for the task at hand.

 

I recently quipped during a keynote that XML in its entirety had become more complex than COM ever was. Even though by the end of the 1990's pretty much any API Microsoft produced was COM-based, there was a very tiny kernel of interfaces that you could build your own personal empire with while safely ignoring the rest.  Here's the list as I remember it:

 

IUnknown

Needed to make object references work.

IClassFactory

Needed to give the loader a hook into your code for object creation.

IDispatch

Needed to integrate with scripting/late-binding.

IMarshal

Needed to play games with parameter marshaling.

IMoniker

Needed to play games with object naming (e.g., VB's GetObject).

 

Note that of these five interfaces, only the first two were strictly necessary (although IDispatch was damn near mandatory given classic ASP).

 

Even if you take the transitive closure over these interfaces (e.g., which picks up the system-implemented interfaces IBindCtx, ITypeInfo, ITypeLib, IStream), the list of interfaces you were expected to implement was actually quite small.  Where COM got nasty for most developers was in the details - because COM sat outside of your programming environment, you had to do a non-trivial amount of manual labor, all of which was extremely low-level (either memory management or thread management).

 

And so now, we have XML. Here's my rough pass at the "kernel" I couldn't imagine living without (and why).

 

URI

XML relies on the URI value space for identifiers/references/queries.  It's the lone hard dependency on the old-world web (OWW). It's also the bridge from the old world of centralized protocol standardization (try to register a new scheme with IANA) to the new world of ad hoc decentralized authorities (that DNS name you squatted on in the 1990's makes you a first-class banana republic).

UTF-8

XML did a great job of delegating the octet->character mapping to Unicode which allowed XML to work exclusively in terms of abstract Unicode code points.

XML 1.0 + Namespaces in XML

These two specs are pretty much indivisible at this point - between the two of them we get a syntax over Unicode character sequences that gets people out of the lex business.

XML Schema Part II

Reaching agreement on type definitions is incredibly hard (and often impossible). XSD Part II gives the world a set of common datatypes that, despite its warts, people seem to feel is good enough.  To paraphrase KeithBa, if God wanted to write down an int in Unicode, God would use the rules from XSD Part II.

SOAP/1.1 Section 4 or SOAP/1.2 Part I Sections 2 and 5.

Strip away the idiosyncratic encoding rules of SOAP/1.1 Section 5 and the quasi-formalisms of SOAP/1.2 Sections 3 and 4 and you get a pretty nice data model for augmenting an XML element with additional XML information that has a fighting chance of not melting down in the face of intermediaries. 
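
To make the XSD Part II point concrete, here is a minimal Python sketch of reading an xs:int from its lexical form: strip surrounding XML whitespace, accept only an optional sign followed by ASCII digits, then check the 32-bit range. The name parse_xs_int is made up for illustration, and this is only a sketch of one datatype, not a schema validator.

```python
import re

# Lexical form of xs:int: optional sign, then one or more ASCII digits.
_INT_LEXICAL = re.compile(r"[+-]?[0-9]+")
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def parse_xs_int(text):
    """Parse an xs:int lexical value (hypothetical helper, for illustration)."""
    # Only XML whitespace (space, tab, CR, LF) is trimmed before matching.
    collapsed = text.strip(" \t\r\n")
    if not _INT_LEXICAL.fullmatch(collapsed):
        raise ValueError(f"not a valid xs:int lexical form: {text!r}")
    value = int(collapsed)
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError(f"out of range for xs:int: {text!r}")
    return value


print(parse_xs_int(" +042\n"))
```

The regular expression is deliberately stricter than Python's own int(), which would also accept forms such as "1_000" that are not legal xs:int lexical values.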

 

In terms of transitive closure, I only omitted two specs that I'm aware of:

 

XML Base

The dependency on URI brings with it the use of relative URI. Relative URI are nasty in that you can't interpret them without context, specifically, what base URI they are relative to. XML Base tells you how to figure this out (and control it) - these rules are picked up by the xs:anyURI type from XML Schema Part II.
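
As a rough illustration of why a relative URI needs context, the following Python sketch walks a tree, tracks xml:base as it goes, and resolves the same relative href against two different bases. The document, the bob.blog and example.org URLs, and the resolve_links helper are all assumptions made up for the example; it uses only urljoin and ElementTree from the standard library and is not a complete XML Base implementation.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

DOC = """<feed xml:base="http://bob.blog/archive/">
  <entry><link href="28"/></entry>
  <entry xml:base="/drafts/"><link href="28"/></entry>
</feed>"""


def resolve_links(element, base):
    """Yield absolute link targets, tracking xml:base down the tree."""
    # An xml:base value may itself be relative to the enclosing base.
    base = urljoin(base, element.get(XML_BASE, ""))
    if element.tag == "link":
        yield urljoin(base, element.get("href"))
    for child in element:
        yield from resolve_links(child, base)


# The second argument stands in for the URI the document was retrieved from.
links = list(resolve_links(ET.fromstring(DOC), "http://example.org/doc.xml"))
for link in links:
    print(link)
```

Both link elements carry the identical string "28", yet they resolve to http://bob.blog/archive/28 and http://bob.blog/drafts/28 respectively.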

 

OK, so that is the good news. The bad news is that XML Base was added after XML 1.0 was out the door, which means some legacy technologies don't handle it correctly (even Namespaces in XML left this undefined, which resulted in quite the brouhaha over Microsoft's use of relative URI as namespace identifiers).

 

Another relative URI nit is that because context is needed, when you sign an XML fragment that contains a relative URI, you aren't actually signing the intended value but instead are signing only the relative portion. This means that when you use XML Exclusive C14N to calculate digests, you aren't necessarily getting the whole story (in fact, this is called out in the spec as a known characteristic).
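
A small Python sketch of that signing nit: two documents differ only in the xml:base on an ancestor, the fragment containing the relative URI is digested on its own, and the digests come out identical even though the intended targets differ. Everything here is an assumption for illustration: the documents, the URLs, the digest_and_target helper, SHA-256 as the digest, and ElementTree's canonicalize function (which implements C14N 2.0) standing in for Exclusive C14N.

```python
import hashlib
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def digest_and_target(document):
    """Return (digest of the <link> fragment, absolute URI it refers to)."""
    root = ET.fromstring(document)
    link = root.find("link")
    # Serialize only the fragment; the ancestor's xml:base is not part of it.
    fragment = ET.tostring(link, encoding="unicode")
    # C14N 2.0 via the standard library, used here as a stand-in.
    canonical = ET.canonicalize(fragment)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    target = urljoin(root.get(XML_BASE), link.get("href"))
    return digest, target


doc_a = '<feed xml:base="http://bob.blog/"><link href="28"/></feed>'
doc_b = '<feed xml:base="http://mirror.example/"><link href="28"/></feed>'

digest_a, target_a = digest_and_target(doc_a)
digest_b, target_b = digest_and_target(doc_b)
print(digest_a == digest_b)
print(target_a, target_b)
```

The digest covers only the characters "28", so a verifier that wants the resolved value protected has to sign the base as well or use absolute URI in the signed content.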

 

 

XML Information Set (Infoset)

There are actually two dependencies on the Infoset. XML Schema Part II relies on it to provide the [Base URI] property needed for xs:anyURI and the [in-scope namespaces] property for xs:QName. SOAP/1.2 relies on the Infoset for any number of reasons.
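
The xs:QName dependency can be illustrated with a short Python sketch: the same attribute value a:widget names two different things depending on which namespace declarations are in scope. The document, the urn:example namespaces, and the resolve_qnames helper are invented for illustration, and ElementTree's start-ns/end-ns events are used as a rough stand-in for the [in-scope namespaces] property.

```python
import io
import xml.etree.ElementTree as ET

DOC = b"""<root xmlns:a="urn:example:one">
  <item type="a:widget"/>
  <inner xmlns:a="urn:example:two">
    <item type="a:widget"/>
  </inner>
</root>"""


def resolve_qnames(xml_bytes, attr="type"):
    """Resolve QName-valued attributes to (namespace URI, local name) pairs."""
    scopes = []    # stack of (prefix, uri) declarations currently in scope
    resolved = []
    events = ("start", "start-ns", "end-ns")
    for event, payload in ET.iterparse(io.BytesIO(xml_bytes), events=events):
        if event == "start-ns":
            scopes.append(payload)      # payload is a (prefix, uri) tuple
        elif event == "end-ns":
            scopes.pop()                # declarations close in reverse order
        elif attr in payload.attrib:    # "start" event: payload is the element
            prefix, _, local = payload.attrib[attr].rpartition(":")
            in_scope = dict(scopes)     # inner declarations override outer ones
            # An undeclared prefix resolves to None in this sketch.
            resolved.append((in_scope.get(prefix), local))
    return resolved


pairs = resolve_qnames(DOC)
for uri, local in pairs:
    print(uri, local)
```

Without the in-scope namespace bindings, the string a:widget is just seven characters; the sketch shows it mapping to urn:example:one in the outer scope and urn:example:two in the inner one.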

 

I deliberately did not include these two specs in my "kernel."  Specifically, the Infoset became much less useful towards the end of its development by adding things like [encoding] and [prefix] that are mired in XML 1.0 details. My favorite draft is this one from 2000. Honestly, if I had to pick a standards-based data model to build on, I'd prefer the XPath 1.0 data model over the Infoset any day. That's one reason (amongst many) why I prefer XPathNavigator in the .NET XML stack over any other API out there.

 

I also did not include any number of other technologies, including XPath, XSLT, XML Schema Part I, XML Query, SOAP/1.2 Part II, RDF, Relax NG, HTTP, HTML, XHTML, RSS, XInclude, XPointer, XLink, WSDL, UDDI, ebXML, WS-*, etc. 

 

That isn't meant to imply that any of these technologies are poor quality or useless (although there are examples of both in that list). It simply means that I think you could get a hell of a lot of work done with excellent implementations of the five specs from the kernel and not much else - that's why I think of them as a "kernel."

Escaped vs. Unescaped Markup

The Echo folks are trying to sort out a way to allow real and escaped markup to coexist. I think many of us realized that you cannot abandon escaped markup given the massive amount of classic HTML content out there. You also need to make sure that new apps don't have to dumb-down their content and trigger a second pass over the data just to pick up the markup.

 

I share the concern of Sam and others about basing the selection of real vs. escaped markup solely on the MIME type. Doing so makes it difficult for schema languages like Relax NG or XSD to properly process and/or validate the data.

 

My preference is that the schema type for content be roughly as follows:

 

<element name="content">
  <complexType mixed="true">
    <sequence>
      <any minOccurs="0" maxOccurs="unbounded" processContents="lax" />
    </sequence>
    <!-- attributes elided for clarity -->
  </complexType>
</element>

 

and that any escaped markup (such as HTML) appear in a pre-defined XML element whose type is xsd:string. The use of this element would indicate that the string value of the element is in fact escaped markup.

 

Honestly, given that there are no other commonly deployed markup languages that aren't based on XML, I'd be comfortable/happy calling that element <echo:rawhtml>.

 

Here's an example entry:

 

  <entry xmlns="uri/of/echo/namespace#" > 
    <title>My First Entry</title> 
    <author> 
      <name>Bob B. Bobbington</name> 
      <homepage>http://bob.name/</homepage> 
      <weblog>http://bob.blog/</weblog> 
    </author> 
    <link>http://bob.blog/28</link> 
    <id>http://bob.blog/28</id> 
 
    <created>2003-02-05T12:29:29Z</created> 
    <issued>2003-02-05T12:29:29Z</issued> 
    <modified>2003-02-05T12:29:29Z</modified> 
 
    <content type="application/xhtml+xml" xml:lang="en-us"> 
      <p xmlns="...">Hello, <em>weblog</em> world! 2 &lt; 4!</p> 
    </content> 
 
    <content type="text/plain" xml:lang="en-us" > 
      <![CDATA[ Hello, weblog world! 2 < 4 ]]> 
    </content> 
 
    <content type="text/html" xml:lang="en-us"> 
      <rawhtml><![CDATA[ <p>Hello, <em>weblog</em> world! 2 &lt; 4!</p> ]]> </rawhtml>
    </content> 
    
    <content type="image/png" xml:lang="en-us" href="http://bob.blog/helloworld.png" /> 
  </entry> 
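
To show what the consumer side of this might look like, here is a minimal Python sketch that reads a cut-down entry: real markup is already in the tree after the first parse, while the presence of a rawhtml element is the signal that a second pass over its string value is needed. The urn:example:echo namespace, the trimmed entry, and the TagCollector and markup_tags names are assumptions for illustration, not part of any Echo draft.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

ECHO = "urn:example:echo"  # hypothetical namespace, not the real Echo URI
XHTML = "http://www.w3.org/1999/xhtml"

ENTRY = f"""<entry xmlns="{ECHO}">
  <content type="application/xhtml+xml">
    <p xmlns="{XHTML}">Hello, <em>weblog</em> world!</p>
  </content>
  <content type="text/html">
    <rawhtml>&lt;p&gt;Hello, &lt;em&gt;weblog&lt;/em&gt; world!&lt;/p&gt;</rawhtml>
  </content>
</entry>"""


class TagCollector(HTMLParser):
    """Second-pass parser that records the start tags in escaped HTML."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def markup_tags(content):
    """Return (tag names, kind) for one content element."""
    raw = content.find(f"{{{ECHO}}}rawhtml")
    if raw is not None:
        # Escaped markup: the element's string value must be parsed again.
        collector = TagCollector()
        collector.feed(raw.text or "")
        collector.close()
        return collector.tags, "escaped"
    # Real markup: the elements are already in the tree from the first pass.
    tags = [el.tag.split("}")[-1] for child in content for el in child.iter()]
    return tags, "real"


entry = ET.fromstring(ENTRY)
results = [markup_tags(c) for c in entry.findall(f"{{{ECHO}}}content")]
for tags, kind in results:
    print(kind, tags)
```

The dispatch here keys off the element name rather than the type attribute, which is the property that lets a schema-aware processor treat the two cases differently.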

Echo

A friend just called me at home to ask if I've been tracking the PIE work that Sam Ruby kicked off.  Honestly, I've been so heads down working on getting our product milestone off the ground that I've been completely absent from the blogosphere and will likely stay that way for the next two months or so.

 

While I'm sorry I haven't tracked what's happening, I must say I'm impressed with the Echo format that Sam and his cast of thousands have converged upon. I think

 

I hope that Sam can rein in the unbounded design churn that Wikis can foster and start making hard choices so that the world can move on to building apps and stop arguing about bad use of XML or insanely insane personality wars.

 

 

 

I started a new book today

Here's the first sentence:

 

Software lives at the boundary between objective and subjective reality.

 

More to follow.



Content © 2003 Microsoft Corporation

Not powered by BlogX just yet...