WTS 2010

Workshop on Formal Methods for Web Data Trust and Security

Preliminary Program

The following talks are already programed in the workshop.

We can still arrange some additional slots, if you are willing to contribute, please contact us at WTSworkshop.

Invited Talk (1 hour)

To Protect and Share: Research Issues for Web Database Security
James Cheney

The reason XML, RDF, and other Web data standards have become so popular is that they encourage new ways of sharing and creating valuable data. While these capabilities have great benefits for availability of data, they are in fundamental tension with other traditional goals of security: privacy, confidentiality, rights management, and nonrepudiation.
In this talk, I will discuss some work (with Bravo, Fundulaki and others) on adapting classical access control techniques to XML databases, by defining security policies using XPath or schema information. We discuss some theoretical problems that arise for such policies, including consistency (that is, whether a policy allows a sequence of operations that has the same effect as a forbidden operation) and fairness (that is, whether a static enforcement mechanism is expressive enough to permit all dynamically-allowable updates). However, rigid access control policies may be difficult for users to configure and may be inimical to sharing. Many wiki developers and users believe that wikis should always be world-writable to encourage sharing. Accordingly, we consider techniques for equipping XML databases with history and provenance tracking, enabling auditing user behavior after-the-fact.

Regular Talks (30 minutes)

Detecting XML Query-Update Independence
Nicole Bidoit, Dario Colazzo, Federico Ulliana

Informally, a query and an update are independent when the query result never change after update execution, on any possible input database. Detecting query-update independence is of crucial importance in many contexts. It is crucial to: minimize view re-materialization after updates, when the view is defined by a query; ensure isolation, when queries and updates are executed concurrently; enforce access control policies, when the query is used to define the part of the database that must not be changed by a user update. Checking query-update independence in a static manner is highly desirable in all of these contexts.
In this talk, we present a novel schema-based approach for detecting XML query-update independence. The new technique relies on a schema analysis which is radically different from that of existing approaches, like a recent one presented in a VLDB paper by Benedikt and Cheney, and from those of existing type systems for XQuery like languages. In a nutshell, given an XML schema and an XPath (occurring in either the query or update) we compute a set of ordered sequences of node labels (called 'chains') that could be traversed by the path on each data instance of the schema. So we infer the type of nodes selected by query/update paths, and we also infer which labels are traversed and in which order. This is at the basis of an extremely precise independence analysis.
Our technique supports recursive DTDs, which require special care in order to avoid inferring infinite sets of chains. We show how a finite upper bound, on the size of chains to be inferred, can be determined in terms of structural properties of the query and the update. The resulting 'finite' analysis turns out to be equivalent to the 'infinite' analysis. This is reminiscent of the well known Finite Model Property technique used in the context of finite model theory.
We have implemented our chain-based technique, and performed tests on a wide class of XMark queries and updates. Test results show that the gain in terms of precision wrt the recent Benedickt and Cheney' approach is huge, while the time needed to perform static analysis remains negligible, even if the XMark DTD features complex recursive definitions, and several queries and updates involve XPath expressions using ancestor, descendant, following and preceding axes.

View update translation for XML
Iovka Boneva, Benoit Groz, Sophie Tison, Anne-Cecile Caron, Yves Roos, Slawek Staworko

We study the problem of update translation for views on XML documents. More precisely, given an XML view definition and a user defined view update program, find a source update program that translates the view update without side effects on the view. Additionally, we require for the translation to be defined on all possible source documents; this corresponds to Hegner's notion of uniform translation. The existence of such translation would allow to update XML views without need of materialisation.
The class of views we consider can remove parts of the document and rename nodes. Our update programs define the simultaneous application of a collection of atomic update operations among insertion/deletion of a subtree and node renaming. Such update programs are compatible with the XQuery Update Facility (XQUF) snapshot semantics. Both views and update programs are represented by recognizable tree languages.
Two settings for the update problem are studied: without source constraints, where all source updates are allowed, and with source constraints, where there is a restricted set of authorized source updates. Using tree automata techniques, we establish that without constraints, all view updates are uniformly translatable and the translation is tractable. In presence of constraints, not all view updates are uniformly translatable. However, we introduce a reasonable restriction on update programs for which uniform translation with constraints becomes possible. Finally, we present as a proof of concept a small fragment of the XQUF that can be expressed by our update programs, thus allows for update propagation.

WebdamExchange: A Model for Data Access on the Web
Serge Abiteboul, Alban Galland, Amelie Marian, Alkis Polyzotis

The emergence of Web 2.0 and social network applications has enabled more and more users to share sensitive information over theWeb. This strongly stresses the need for a comprehensive logical model to specify, manipulate and control information in a distributed environment.
We introduce such a model here, namely WebdamExchange. The main originality of our approach is that we articulate the problem as a general distributed knowledge-base management problem. Logical statements are used to specify data, access rights, secret keys, localization, time, provenance and knowledge about other peers, aspects that are typically considered separately. Knowledge can be communicated, replicated, queried, updated, and monitored. Relying on this knowledge base, we show how to control that information is not used improperly. The fact that the model is formally de fined allows formally proving or disproving desirable properties such as soundness (data is only acquired legally) and completeness (one can acquire all data that one can legally claim). We illustrate the use of the model with very different scenarios based on servers ranging from centralized to massively distributed, from fully trusted to untrusted, and providing encrypted or clear information. We briefly discuss a system we implemented that supports these different scenarios. The fact that they are all described in the same model opens the way for exploring situations that combine these different scenarios in arbitrarily rich ways, which is the reality of today's Web.

Random Generation of Hard Instances for the positive TAGED emptiness problem
Vincent Hugot, Olga Kouchnarenko, Pierre-Cyrille Heam

Tree automata are a widely used formalism in Computer Science. Since their creation in the fifties, numerous more expressive extensions have been proposed. Unfortunately, the decision problems associated with these extensions are quite often undecidable or in prohibitive classes of algorithmic complexity (NP-complete or worse), and little work has gone into finding efficient heuristics for them. Beyond the inherent difficulty of those problems, a common hitch in this line of research is the experimental evaluation of new algorithms. As those extensions of tree automata have remained in chiefly theoretical spheres, there are no established testbeds from the ``real world'' against which to quantify the efficiency (or lack thereof) of new algorithms. Failing that, there is a need to generate suitable testbeds at random.
Regrettably, there is little material in the literature regarding random generation of tree automata, and none at all regarding extensions such as Tree Automata with Global Equality and Disequality Constraints (TAGEDs).
It should also be noted that what little material there is does not concern itself with the interest of the generated automata wrt specific decision problems.
In this talk we present a scheme for random generation of positive TAGEDs, with a focus on making them interesting wrt the Emptiness problem.

Deciding emptiness of automata with global (dis)equality constraints
Luis Barguņķ, Carles Creus, Guillem Godoy, Florent Jacquemard, Camille Vacher

We present two new results related to the tree automata with global equality and disequality constraints (TAGED), which are tree recognizers able to test for equality and disequality between subterms whose positions are defined by the states reached during a computation.
First, we show that the subclass of TAGED with disequality constraints only is equivalent to automata computing on the dag representation of terms with maximal sharing (t-dags automata). Then, using some techniques developed by Wiltold Charatonik for the emptiness decision of t-dags automata, we propose a proof of decidability of the emptiness problem for the whole class TAGED. This is the first emptiness decision algorithm with an elementary complexity for this class automata.

Security and Access Control at the W3C
Mohamed ZERGAOUI (Innomax SARL)

In the past 18 months, there has been a lot of work around Security and Access Control especially with the come back of HTML5 and withrespect to all the mobile ecosystem and partly Geolocalisation.
I will dig into specification like [1] on which I've involved to explain how those specifications need inputs from researchers as well as tools to validate the approach retained.
[1] http://www.w3.org/TR/2009/WD-widgets-access-20091208

Rewrite Speci cations of Access Control Policies in Distributed Environments
Clara Bertolissi and Maribel Fernandez

We define a metamodel for access control that takes into account the requirements of distributed environments, where resources and access control policies may be distributed across several sites. This distributed metamodel is an extension of the category-based metamodel proposed in previous work (from which standard centralised access control models such as MAC, DAC, RBAC, Bell-Lapadula, etc. can be derived).
Having a uni ed common semantics is important in access control policies where information needs to be shared, as in many distributed applications. We use a declarative formalism in order to give an operational semantics to the distributed metamodel.
We then show how various distributed access control models can be derived as instances of the distributed metamodel, including distributed models where each site implements a different kind of local access control model.

Key Words: Security Policies, Distributed Access Control, Operational Semantics, Rewriting.