Discussion:
Fixing wave identifiers
Alex North
2010-12-17 00:14:53 UTC
Wave identifiers as currently implemented (Wave[let]Id.java) do not conform
to the draft specification we published at
http://wave-protocol.googlecode.com/hg/spec/waveid/waveidspec.html. That
spec limits code points valid inside an identifier with an explicit goal of
supporting natural URI construction and wave references/links.

The existing code is far too relaxed in allowing just about any character in
an id, requiring lots of escaping wherever they are used and generally
causing pain. The existing escaping scheme (WaveId.serialize()) is also a
bit simplistic and doesn't help matters (the results are still not
URL-safe).
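
As a rough illustration of what "limiting the valid code points" could look like in code, here is a minimal sketch of a token check. The class name is hypothetical, and the allowed character set (the RFC 3986 "unreserved" characters plus '+') is an assumption made for this example; the draft spec's actual grammar may differ.

```java
/** Hypothetical helper for illustration; not part of the real Wave[let]Id.java. */
final class WaveIdTokenCheck {
  private WaveIdTokenCheck() {}

  // ASSUMPTION: the allowed set is the RFC 3986 "unreserved" characters
  // (letters, digits, '-', '.', '_', '~') plus '+'. The draft spec may differ.
  static boolean isSafeChar(char c) {
    return (c >= 'a' && c <= 'z')
        || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9')
        || c == '-' || c == '.' || c == '_' || c == '~' || c == '+';
  }

  /** Returns true only for non-empty tokens made entirely of safe characters. */
  static boolean isValidToken(String token) {
    if (token == null || token.isEmpty()) {
      return false;
    }
    for (int i = 0; i < token.length(); i++) {
      if (!isSafeChar(token.charAt(i))) {
        return false;
      }
    }
    return true;
  }
}
```

With a check like this applied when an id is constructed, a token such as w+abc would pass, while one containing '/' would be rejected up front instead of needing escaping everywhere it is used.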

We lagged in fixing this because the prospect of migrating existing Google
Wave data was too daunting. Apache Wave is the perfect opportunity to fix
some fundamental flaws here, before too much data is generated (yay for no
persistence yet).

I propose to change wave ids to implement the draft spec we published and
clean out lots of serialization cruft. The biggest potential roadblock to
this is that *if* a federated service generates ids that are incompatible
with the spec, those ids will not be allowed by WIAB. Since there are no
WIAB services that can persist data yet, I don't expect many such ids to
exist, but I'm aware some services may be creating and persisting data
without using WIAB code.

The change will also affect every place where ids are transported or persisted.
Some examples:
- user-data wavelet
- wave links
- URL bar history hash
- c/s protocol
- robot protocol

The robot protocol is an interesting case, because changing the id
serialisation to be a URI is backwards-incompatible. I hope we can move the
robot client library forward to use the new form, but if developers desire
it we may need to keep supporting the old serialisation just for that
protocol for a while.

Comments? Objections?
--
You received this message because you are subscribed to the Google Groups "Wave Protocol" group.
To post to this group, send email to wave-protocol-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to wave-protocol+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/wave-protocol?hl=en.
Chris Harvey
2010-12-17 02:54:26 UTC
Comments:

Should the question not be: Is the draft specification accepted as a
committed specification?

If 'no' then debate/refine until it is. If 'yes' then go ahead and make the
changes to the WIAB code, because WIAB does not conform to the spec; in as
smooth a way as possible, of course :)

There is a general issue here: Whilst applying a spec to the code may have
'lagged' in the old Google Wave, great care must be taken so that the same
does not happen with WIAB. We need to ensure that any proposed
change/enhancement during WIAB coding is referenced back to the appropriate
spec (and if a difference, or omission, is found then the spec must be
debated/changed/created before the code is implemented).

IMO it is imperative that spec precedes code, and not "we'll do the spec
later; we need to get the code out of the door first".

Objections:

None.
--
Chris
iotawave.org
Singapore
Alex North
2010-12-19 23:22:29 UTC
Post by Chris Harvey
Should the question not be: Is the draft specification accepted as a
committed specification?
Since the answer to "is X accepted as a specification" is "no" for all X,
including the old wave id format, I don't think we need to debate until
something is formally accepted just yet. In my experience it is impossible
to write a working spec without implementing something at the same time.
The new id spec is a draft because we haven't implemented it yet, and I
don't know yet whether it makes sense.

I don't believe that WIAB is at a stage where we should have nailed down
specs before we can implement anything. That kind of process will put
impossible drag on development. Thus I disagree that spec should precede
code... yet! We will reach such a point for protocol-related things, but I
don't think we're there.

However, I do strongly believe design documents should precede significant
code. That draft spec, which has been available for months, functions as a
design for the ids, though I can build a more detailed justification if
people want.

A.

Note: we should discuss the issue of specs before code in another thread -
let's keep this one to ids.
Tad Glines
2010-12-17 03:05:41 UTC
I think this transition would be easier if there were a defined translation
to/from the old/new ids. Then WIAB could store/use new ids, but still be
able to support APIs that need to use the old ids.

-Tad
Vega
2010-12-17 14:47:45 UTC
Alex, can you please describe the differences between the wave ids as
they are in Google Wave/Sandbox instances and the draft spec? Let us
not forget that the Sandbox instance is federated and has some data.
Also, I couldn't understand from the spec how blip (document) ids are
represented. In Google Wave each blip has its own id, like:
googlewave.com/w+ihNkzrFlA/~/conv+root/b+LWFcI7ZkBH9.
I am currently working on link editing/rendering, and in this context it
seems that WIAB currently doesn't allow implementing links down to the
blip level.
Yuri Z. (a.k.a Vega)
2010-12-18 11:48:23 UTC
Also, I think that the full identifier should be of the form
example.com/w+abc/~conv+root/b+1234/a+abcd
where the last path section (/a+abcd) is a custom optional anchor. This
allows referencing elements not only down to the blip level, but also
inside blips. The ability to uniquely reference elements inside a blip is
important for search. For example, if a user searches for certain text in
Google Wave, the search result is just a list of waves that contain the
specified text. If there are 900 blips in the wave, it is of very little
use to know that one of them contains the text you need; you still have to
go over the blips manually and look for the text.
Even if the search took you directly to the blip that contains the
text, it would still not be good enough, as blips can be very long. So the
full id should be able to reference elements inside a particular blip.
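
To make the idea concrete, here is a minimal sketch of splitting such a path into its parts. It assumes the segment layout of the Google Wave example earlier in the thread (wave domain / wave id / wavelet domain / wavelet id / blip id), treats "~" as "same domain as the wave", and adds the proposed anchor as an optional sixth segment. The class and field names are hypothetical, and none of this is taken from the WIAB code.

```java
/** Hypothetical sketch of a path-style wave reference with an optional anchor. */
final class WaveRefSketch {
  final String waveDomain;
  final String waveId;
  final String waveletDomain; // null when the reference stops at the wave
  final String waveletId;     // null when the reference stops at the wave
  final String blipId;        // null when the reference stops at the wavelet
  final String anchor;        // null when the reference stops at the blip

  private WaveRefSketch(String[] p) {
    waveDomain = p[0];
    waveId = p[1];
    // ASSUMPTION: "~" is shorthand for "same domain as the wave".
    waveletDomain = p.length > 2 ? ("~".equals(p[2]) ? p[0] : p[2]) : null;
    waveletId = p.length > 3 ? p[3] : null;
    blipId = p.length > 4 ? p[4] : null;
    anchor = p.length > 5 ? p[5] : null;
  }

  /** Parses domain/waveId[/waveletDomain/waveletId[/blipId[/anchor]]]. */
  static WaveRefSketch parse(String path) {
    // Limit -1 keeps trailing empty segments so they can be rejected below.
    String[] parts = path.split("/", -1);
    if (parts.length < 2 || parts.length == 3 || parts.length > 6) {
      throw new IllegalArgumentException("Unexpected number of segments: " + path);
    }
    for (String part : parts) {
      if (part.isEmpty()) {
        throw new IllegalArgumentException("Empty segment in: " + path);
      }
    }
    return new WaveRefSketch(parts);
  }
}
```

Parsing example.com/w+abc/~/conv+root/b+1234/a+abcd with this sketch would yield the blip id b+1234 and the anchor a+abcd, while a shorter path such as example.com/w+abc leaves the finer-grained fields null.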
Yuri Z. (a.k.a Vega)
2010-12-18 11:59:05 UTC
To state it more clearly: the smallest wave entity is currently the blip,
but that is too coarse-grained. The smallest wave entity should be an
"anchor", so every blipUi has at least one anchor, at the document
start.
STenyaK
2010-12-18 12:35:19 UTC
I cannot agree more. Even if the common usage of waves until now has
been many small blips, there are waves consisting of a few long
blips, in which case a link to different places inside a blip is
needed. I have little idea about the internals, so I may be using the
wrong terms, but I think something like "blip/wave version" +
"character count" would be needed.
This way, you can link anywhere inside a blip (e.g. "right click" ->
"link to this exact place in blip"), and then the "place" gets the
appropriate OTs applied if the blip changes (from the linked blip/wave
version up to the latest blip/wave version), so that it keeps pointing to
the same place the original link author intended.
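
A heavily simplified sketch of that idea follows: a linked character offset is carried forward through later edits so it keeps pointing at the same text. It models edits as plain character insertions and deletions, which is an assumption made for illustration and much simpler than real wave document operations; the class and method names are hypothetical.

```java
/** Hypothetical sketch: moving a linked character offset through later edits. */
final class AnchorTransformSketch {
  private AnchorTransformSketch() {}

  /** New offset after `length` characters are inserted at position `at`. */
  static int afterInsert(int anchor, int at, int length) {
    // Design choice: text inserted at or before the anchor pushes it right.
    return at <= anchor ? anchor + length : anchor;
  }

  /** New offset after `length` characters are deleted starting at position `at`. */
  static int afterDelete(int anchor, int at, int length) {
    if (anchor <= at) {
      return anchor;            // deletion is entirely after the anchor
    }
    if (anchor >= at + length) {
      return anchor - length;   // deletion is entirely before the anchor
    }
    return at;                  // anchored text was deleted; collapse to the deletion point
  }
}
```

A link recorded at offset 10 of some blip version would be resolved by replaying every later edit through these two functions, in order, up to the latest version.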
--
Saludos,
     Bruno González

_______________________________________________
Jabber: stenyak AT gmail.com
http://www.stenyak.com
Alex North
2010-12-19 23:08:56 UTC
Hey Yuri - I agree in general. The new URI format is flexible enough for
those kinds of extensions, and I've had some ideas about how to implement
them already.
The WaveRef at the moment only goes to the blip level just because that's
the only level we got round to using.
Alex North
2010-12-19 23:14:18 UTC
Permalink
Post by Tad Glines
I think this transition would be easier if there was a defined translation
to/from the old/new ids. Then WiaB could store/use new ids, but be able to
support APIs that need to use the old ids.
There are two different things going on here. One is translating between id
serializations to talk to older APIs. I definitely agree and will implement
that.

The second is dealing with existing data that uses ids containing characters
that are not valid in the new spec (e.g. '/' inside a token). Because WIAB
has not had persistence, there is no existing data there. It's easy to
define a one-way mapping so that for importing data from Google Wave or
other systems, WIAB can find a new id for broken ones (and update all
references to it on the way). But a two-way mapping is going to be harder. I
don't think it's worthwhile unless someone screams out that making ids
containing '/' invalid (or anything else in the draft spec) is going to
break them.

A.
Tad Glines
2010-12-20 02:47:39 UTC
Permalink
Post by Alex North
Post by Tad Glines
I think this transition would be easier if there was a defined translation
to/from the old/new ids. Then WiaB could store/use new ids, but be able to
support APIs that need to use the old ids.
There are two different things going on here. One is translating between id
serializations to talk to older APIs. I definitely agree and will implement
that.
The second is dealing with existing data that uses ids containing
characters that are not valid in the new spec (e.g. '/' inside a token).
Because WIAB has not had persistence, there is no existing data there. It's
easy to define a one-way mapping so that for importing data from Google Wave
or other systems, WIAB can find a new id for broken ones (and update all
references to it on the way). But a two-way mapping is going to be harder. I
don't think it's worthwhile unless someone screams out that making ids
containing '/' invalid (or anything else in the draft spec) is going to
break them.
Why not just URI encode (or similar) the bits that are not allowed in the
new spec? Since there is no defined translation from old to new, why not
define a translation that ensures that illegal characters in the old spec get
translated (in a reversible way) to a set of legal characters in the new spec?
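
A minimal sketch of what such a reversible translation could look like, not a
scheme anyone in this thread has specified. It assumes '~' is legal in the new
spec and reserves it as an escape character, writing every other disallowed
character as '~' plus four hex digits. The class name, method names and the
punctuation set are all hypothetical; the real allowed set must come from the
draft spec.

```java
public class ReversibleIdEscaper {
  // ASSUMPTION: punctuation treated as legal in a new-style token, with '~'
  // held back for use as the escape character. Check against the draft spec.
  private static final String PLAIN_PUNCTUATION = "-._+*@";

  private static boolean isPlain(char c) {
    boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9');
    return alnum || PLAIN_PUNCTUATION.indexOf(c) >= 0;
  }

  /** Rewrites an old-style token using only plain characters and '~'. */
  public static String escape(String oldToken) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < oldToken.length(); i++) {
      char c = oldToken.charAt(i);
      if (isPlain(c)) {
        sb.append(c);
      } else {
        // Covers '/', '%', and '~' itself, so the mapping stays reversible.
        sb.append('~').append(String.format("%04x", (int) c));
      }
    }
    return sb.toString();
  }

  /** Inverse of escape(). */
  public static String unescape(String newToken) {
    StringBuilder sb = new StringBuilder();
    int i = 0;
    while (i < newToken.length()) {
      char c = newToken.charAt(i);
      if (c == '~') {
        if (i + 5 > newToken.length()) {
          throw new IllegalArgumentException("Truncated escape in: " + newToken);
        }
        sb.append((char) Integer.parseInt(newToken.substring(i + 1, i + 5), 16));
        i += 5;
      } else {
        sb.append(c);
        i++;
      }
    }
    return sb.toString();
  }
}
```

With this sketch, "a/b" becomes "a~002fb" and round-trips back to "a/b". Note
the cost: an id that merely contains '~' also has to be rewritten, so a
receiver needs some way to know whether a given id has been escaped.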

-Tad
Alex North
2010-12-20 03:32:19 UTC
Permalink
Post by Tad Glines
Post by Alex North
Post by Tad Glines
I think this transition would be easier if there was a defined
translation to/from the old/new ids. Then WiaB could store/use new ids, but
be able to support APIs that need to use the old ids.
There are two different things going on here. One is translating between id
serializations to talk to older APIs. I definitely agree and will implement
that.
The second is dealing with existing data that uses ids containing
characters that are not valid in the new spec (e.g. '/' inside a token).
Because WIAB has not had persistence, there is no existing data there. It's
easy to define a one-way mapping so that for importing data from Google Wave
or other systems, WIAB can find a new id for broken ones (and update all
references to it on the way). But a two-way mapping is going to be harder. I
don't think it's worthwhile unless someone screams out that making ids
containing '/' invalid (or anything else in the draft spec) is going to
break them.
Why not just URI encode (or similar) the bits that are not allowed in the
new spec?
The new spec is explicitly constructed to have wave links (URI forms of wave
ids) not be full of ugly encoding. If characters in tokens making up an id
are %-encoded already then they get double %-encoded when the id is
expressed as a URI. % is a disallowed character for this reason. Some other
encoding would be possible, sure, but hella ugly. I think it would be
complex to identify when this encoding would be necessary; modern systems
interoperating would never need it.
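
The double-encoding effect can be shown with a small sketch. It uses the
standard java.net.URLEncoder as a stand-in for whatever would build the URI
form (URLEncoder implements form encoding, but it treats '/' and '%' the same
way for this purpose). The class name and the sample token are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DoubleEncodingDemo {
  /** Percent-encodes a token as it might be when placed into a URI. */
  public static String encode(String token) {
    try {
      return URLEncoder.encode(token, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is required to be supported on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String raw = "a/b";                 // hypothetical token containing '/'
    String stored = encode(raw);        // token stored already %-encoded
    String inUri = encode(stored);      // encoded again when expressed as a URI
    System.out.println(stored);         // prints a%2Fb
    System.out.println(inUri);          // prints a%252Fb
  }
}
```

The second pass turns the '%' from the first pass into "%25", which is the
ugliness that disallowing '%' in tokens avoids.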
Post by Tad Glines
Since there is no defined translation from old to new, why not define a
translation that ensures that illegal characters in the old spec get
translated (in a reversible way) to a set of legal characters in the new spec?
I agree that such a translation is possible in principle, but I believe it
will be complex. The only case where I can actually see it being useful is
for federated systems with existing data. These will be thwarted by
signatures anyway. And I don't know of any that will actually have a problem
(ids that aren't automatically valid are unlikely).
Alex North
2011-01-07 00:18:50 UTC
Permalink
So I don't think I actually convinced many people of the value of change
here, but neither were there any outcries about anything that might actually
break.

I've been making numerous improvements to the use of ids inside WIAB, and if
you've been following the patches this might have been educational. But I'm
now up to the point of changing the format of the data that goes on disk and
over the wire. I'm willing to do this work.

For the robot protocol I definitely agree with Tad that we should keep
interoperating with robots using the old serialisation. I propose that we
continue serialising to the same format as robots currently expect, but
restrict ids to those conforming to the new specification. The transport
serialisation can then change to the URI form later. Since WIAB doesn't yet
persist data, robots interacting with WIAB can't have any stored data that
will be made invalid.

For federated deltas, and the data embedded in those deltas (like in the
conversation model and user-data wavelet) I also propose to both restrict
valid identifiers to the charsets in the draft spec and serialise them in
the straightforward draft-spec way rather than the undocumented and awkward
current method. In this case I don't think there is a compatibility option;
the serialised ids are part of the signed delta. Again, however, WIAB
doesn't persist data and no-one shouted out with things that will break
(wavesandbox is a sandbox and not maintained, demo.waveinabox.org is a
better integration test) - I think now is the time to make this
simplification.
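
As a rough illustration of what restricting identifiers to a fixed charset
could look like in code, here is a minimal sketch of a token check. It is not
the WIAB implementation. The class and method names are hypothetical, and the
punctuation set is an assumption that must be checked against the draft spec;
the only things taken from this thread are that '/' and '%' are not valid
inside a token.

```java
public class IdTokenCheck {
  // ASSUMPTION: placeholder set of permitted punctuation. The authoritative
  // list is whatever the draft spec defines, and it may also permit
  // characters outside ASCII, which this sketch rejects.
  private static final String ALLOWED_PUNCTUATION = "-._~+*@";

  /** Returns true if every character of a non-empty token is permitted. */
  public static boolean isValidToken(String token) {
    if (token == null || token.isEmpty()) {
      return false;
    }
    for (int i = 0; i < token.length(); i++) {
      char c = token.charAt(i);
      boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
          || (c >= '0' && c <= '9');
      if (!alnum && ALLOWED_PUNCTUATION.indexOf(c) < 0) {
        return false;
      }
    }
    return true;
  }
}
```

A check like this would run wherever ids are parsed off the wire or from disk,
so that tokens such as "a/b" or "a%2Fb" are rejected up front instead of being
escaped later.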

Please let me know if you disagree so we can figure out a way forward.

Alex