pyRdfa.state

Parser's execution context (a.k.a. state) object and handling. The state includes:

  • language, retrieved from C{@xml:lang} or C{@lang}
  • URI base, determined by C{<base>} or set explicitly. This is a little bit superfluous, because the current RDFa syntax does not make use of C{@xml:base}; i.e., this could be a global value. But the structure is prepared to add C{@xml:base} easily, if needed.
  • options, in the form of an L{options<pyRdfa.options>} instance
  • a separate vocabulary/CURIE handling resource, in the form of an L{termorcurie} instance
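
The language item above can be illustrated with a small, self-contained sketch. It is not the module's code: C{resolve_lang} is a hypothetical helper name, and the sketch only mirrors the HTML-family rule implemented further down in C{ExecutionContext.__init__} (C{@xml:lang} is consulted before C{@lang}, values are lower-cased, and an empty value resets the language to None).

```python
from xml.dom.minidom import parseString

def resolve_lang(node, inherited_lang=None):
    """Hypothetical helper: return the language in effect at `node`, or None."""
    lang = inherited_lang
    # @xml:lang is consulted first; @lang is used only if @xml:lang is absent
    for attr in ("xml:lang", "lang"):
        if node.hasAttribute(attr):
            value = node.getAttribute(attr).lower()
            # an empty attribute value explicitly removes the language
            lang = value if value else None
            break
    return lang

doc = parseString('<div xml:lang="EN" lang="fr"><p lang="">x</p><span>y</span></div>')
div = doc.documentElement
p = div.getElementsByTagName("p")[0]
span = div.getElementsByTagName("span")[0]

top = resolve_lang(div)          # both attributes present
reset = resolve_lang(p, top)     # empty @lang
inherited = resolve_lang(span, top)  # no language attribute at all
```

The child elements receive the parent's result as C{inherited_lang}, which corresponds to the way a new state is created from an inherited one.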

The execution context object is also used to handle URI-s, CURIE-s, terms, etc.
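
As a rough illustration of the URI handling, the sketch below reproduces two small steps that the source performs with the Python standard library: removing the fragment identifier from a base URI, and joining a relative value to the base while keeping a trailing C{#} or C{?}. The function name C{join_keeping_marker} and the example URIs are assumptions made for this sketch; the module itself does this in helpers nested inside C{__init__} and C{_URI}.

```python
from urllib.parse import urljoin, urlparse, urlunparse

def remove_frag_id(uri):
    """Return `uri` without its fragment identifier."""
    return urlunparse(urlparse(uri)._replace(fragment=""))

def join_keeping_marker(base, value):
    """Join `value` to `base`; re-append a trailing '#' or '?' if the join dropped it."""
    joined = urljoin(base, value)
    if value and value[-1] in "#?" and not joined.endswith(value[-1]):
        joined += value[-1]
    return joined

base = remove_frag_id("http://example.org/dir/doc.html#sec1")
vocab = join_keeping_marker(base, "vocab#")
plain = join_keeping_marker(base, "other.html")
```

The trailing character matters for vocabulary URIs such as C{http://example.org/dir/vocab#}, where the C{#} is part of the namespace and terms are appended directly to it.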

@summary: RDFa parser execution context
@organization: U{World Wide Web Consortium<http://www.w3.org>}
@author: U{Ivan Herman<http://www.w3.org/People/Ivan/>}
@license: This software is available for use under the U{W3C® SOFTWARE NOTICE AND LICENSE<http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231>}

# -*- coding: utf-8 -*-
"""
Parser's execution context (a.k.a. state) object and handling. The state includes:

  - language, retrieved from C{@xml:lang} or C{@lang}
  - URI base, determined by C{<base>} or set explicitly. This is a little bit superfluous, because the current RDFa syntax does not make use of C{@xml:base}; i.e., this could be a global value.  But the structure is prepared to add C{@xml:base} easily, if needed.
  - options, in the form of an L{options<pyRdfa.options>} instance
  - a separate vocabulary/CURIE handling resource, in the form of an L{termorcurie<pyRdfa.TermOrCurie>} instance

The execution context object is also used to handle URI-s, CURIE-s, terms, etc.

@summary: RDFa parser execution context
@organization: U{World Wide Web Consortium<http://www.w3.org>}
@author: U{Ivan Herman<a href="http://www.w3.org/People/Ivan/">}
@license: This software is available for use under the
U{W3C® SOFTWARE NOTICE AND LICENSE<href="http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231">}
"""

"""
$Id: state.py,v 1.23 2013-10-16 11:48:54 ivan Exp $
$Date: 2013-10-16 11:48:54 $
"""

from rdflib import URIRef
from rdflib import BNode

from .host import HostLanguage, accept_xml_base, accept_xml_lang, beautifying_prefixes

from .termorcurie import TermOrCurie
from . import UnresolvablePrefix, UnresolvableTerm

from . import err_URI_scheme
from . import err_illegal_safe_CURIE
from . import err_no_CURIE_in_safe_CURIE
from . import err_undefined_terms
from . import err_non_legal_CURIE_ref
from . import err_undefined_CURIE

from urllib.parse import urlparse, urlunparse, urlsplit, urljoin

class ListStructure:
    """Special class to handle the C{@inlist} type structures in RDFa 1.1; stores the "origin", i.e.,
    where the list will be attached to, and the mappings as defined in the spec.
    """
    def __init__(self):
        self.mapping = {}
        self.origin = None

#### Core Class definition
class ExecutionContext:
    """State at a specific node, including the current set of namespaces in the RDFLib sense, current language,
    the base, vocabularies, etc. The class is also used to interpret URI-s and CURIE-s to produce
    URI references for RDFLib.

    @ivar options: reference to the overall options
    @type options: L{Options}
    @ivar base: the 'base' URI
    @ivar parsedBase: the parsed version of base, as produced by urllib.parse.urlsplit
    @ivar defaultNS: default namespace (if defined via @xmlns) to be used for XML Literals
    @ivar lang: language tag (possibly None)
    @ivar term_or_curie: vocabulary management class instance
    @type term_or_curie: L{termorcurie.TermOrCurie}
    @ivar list_mapping: list structure whose C{mapping} is a dictionary of arrays, containing a list of URIs key-ed via properties for lists
    @type list_mapping: L{ListStructure}
    @ivar node: the node to which this state belongs
    @type node: DOM node instance
    @ivar rdfa_version: RDFa version of the content
    @type rdfa_version: String
    @ivar supress_lang: in some cases, the effect of the lang attribute should be suppressed for the given node, although it should be inherited down below (example: @value attribute of the data element in HTML5)
    @type supress_lang: Boolean
    @cvar _list: list of attributes that allow for lists of values and should be treated as such
    @cvar _resource_type: dictionary; mapping table from attribute name to the exact method to retrieve the URI(s). Is initialized at first instantiation.
    """

    # list of attributes that allow for lists of values and should be treated as such
    _list = [ "rel", "rev", "property", "typeof", "role" ]
    # mapping table from attribute name to the exact method to retrieve the URI(s).
    _resource_type = {}

    def __init__(self, node, graph, inherited_state=None, base="", options=None, rdfa_version=None):
        """
        @param node: the current DOM Node
        @param graph: the RDFLib Graph
        @keyword inherited_state: the state as inherited
        from upper layers. This inherited_state is mixed with the state information
        retrieved from the current node.
        @type inherited_state: L{state.ExecutionContext}
        @keyword base: string denoting the base URI for the specific node. This overrides the possible
        base inherited from the upper layers. The
        current XHTML+RDFa syntax does not allow the usage of C{@xml:base}, but SVG1.2 does, so this is
        necessary for SVG (and other possible XML dialects that accept C{@xml:base})
        @keyword options: invocation options, and references to warning graphs
        @type options: L{Options<pyRdfa.options>}
        @keyword rdfa_version: RDFa version of the content; if None, the package default (C{rdfa_current_version}) is used
        """
        def remove_frag_id(uri):
            """
            The fragment ID for self.base must be removed
            """
            try:
                # To be on the safe side:-)
                return urlunparse(urlparse(uri)._replace(fragment=""))
            except Exception:
                return uri

        # This is, conceptually, an additional class initialization, but it must be done at run time, otherwise import errors show up
        if len(ExecutionContext._resource_type) == 0:
            ExecutionContext._resource_type = {
                "href"     : ExecutionContext._URI,
                "src"      : ExecutionContext._URI,
                "vocab"    : ExecutionContext._URI,

                "about"    : ExecutionContext._CURIEorURI,
                "resource" : ExecutionContext._CURIEorURI,

                "rel"      : ExecutionContext._TERMorCURIEorAbsURI,
                "rev"      : ExecutionContext._TERMorCURIEorAbsURI,
                "datatype" : ExecutionContext._TERMorCURIEorAbsURI,
                "typeof"   : ExecutionContext._TERMorCURIEorAbsURI,
                "property" : ExecutionContext._TERMorCURIEorAbsURI,
                "role"     : ExecutionContext._TERMorCURIEorAbsURI,
            }
        #-----------------------------------------------------------------
        self.node = node

        #-----------------------------------------------------------------
        # Settling the base. In generic XML, xml:base should be accepted at all levels (though this is not the
        # case in, say, XHTML...)
        # At the top level of parsing, inherited_state is None; that is also
        # where the <base> element is looked for (for the HTML cases, that is)
        if inherited_state:
            self.rdfa_version = inherited_state.rdfa_version
            self.base = inherited_state.base
            self.options = inherited_state.options

            self.list_mapping = inherited_state.list_mapping
            self.new_list = False

            # for generic XML versions the xml:base attribute should be handled
            if self.options.host_language in accept_xml_base and node.hasAttribute("xml:base"):
                self.base = remove_frag_id(node.getAttribute("xml:base"))
        else:
            # this is the branch called from the very top
            self.list_mapping = ListStructure()
            self.new_list = True

            if rdfa_version is not None:
                self.rdfa_version = rdfa_version
            else:
                from . import rdfa_current_version
                self.rdfa_version = rdfa_current_version

            # This value can be overwritten by a @version attribute
            if node.hasAttribute("version"):
                top_version = node.getAttribute("version")
                if "RDFa 1.0" in top_version or "RDFa1.0" in top_version:
                    self.rdfa_version = "1.0"
                elif "RDFa 1.1" in top_version or "RDFa1.1" in top_version:
                    self.rdfa_version = "1.1"

            # this is just to play safe. I believe this should actually not happen...
            if options is None:
                from . import Options
                self.options = Options()
            else:
                self.options = options

            self.base = ""
            # handle the base element case for HTML
            if self.options.host_language in [ HostLanguage.xhtml, HostLanguage.html5, HostLanguage.xhtml5 ]:
                for bases in node.getElementsByTagName("base"):
                    if bases.hasAttribute("href"):
                        self.base = remove_frag_id(bases.getAttribute("href"))
                        # the first <base> element with an @href wins, as in HTML
                        # (the original 'continue' here was a no-op, so the last one won)
                        break
            elif self.options.host_language in accept_xml_base and node.hasAttribute("xml:base"):
                self.base = remove_frag_id(node.getAttribute("xml:base"))

            # If no local setting for base occurs, the input argument has it
            if self.base == "":
                self.base = base

            # Perform an extra beautification in RDFLib
            if self.options.host_language in beautifying_prefixes:
                values = beautifying_prefixes[self.options.host_language]
                for key in values:
                    graph.bind(key, values[key])

            input_info = "Input Host Language:%s, RDFa version:%s, base:%s" % (self.options.host_language, self.rdfa_version, self.base)
            self.options.add_info(input_info)

        #-----------------------------------------------------------------
        # this will be used repeatedly, better store it once and for all...
        self.parsedBase = urlsplit(self.base)

        #-----------------------------------------------------------------
        # generate and store the local CURIE handling class instance
        self.term_or_curie = TermOrCurie(self, graph, inherited_state)

        #-----------------------------------------------------------------
        # Settling the language tags
        # @xml:lang has priority over @lang
        # it is a bit messy: the three fundamental modes (xhtml, html, or xml) are all slightly different:-(
        # first get the inherited state's language, if any
        if inherited_state:
            self.lang = inherited_state.lang
        else:
            self.lang = None

        self.supress_lang = False

        if self.options.host_language in [ HostLanguage.xhtml, HostLanguage.xhtml5, HostLanguage.html5 ]:
            # we may have lang and xml:lang
            if node.hasAttribute("lang"):
                lang = node.getAttribute("lang").lower()
            else:
                lang = None
            if node.hasAttribute("xml:lang"):
                xmllang = node.getAttribute("xml:lang").lower()
            else:
                xmllang = None
            # First of all, set the value, if any
            if xmllang is not None:
                # this has priority; an empty value resets the language
                if len(xmllang) != 0:
                    self.lang = xmllang
                else:
                    self.lang = None
            elif lang is not None:
                if len(lang) != 0:
                    self.lang = lang
                else:
                    self.lang = None
            # Ideally, a warning should be generated if lang and xmllang are both present with different values. But
            # the HTML5 Parser does its magic by overriding a lang value if xmllang is present, so the potential
            # error situations are simply swallowed...

        elif self.options.host_language in accept_xml_lang and node.hasAttribute("xml:lang"):
            self.lang = node.getAttribute("xml:lang").lower()
            if len(self.lang) == 0:
                self.lang = None

        #-----------------------------------------------------------------
        # Set the default namespace. Used when generating XML Literals
        if node.hasAttribute("xmlns"):
            self.defaultNS = node.getAttribute("xmlns")
        elif inherited_state and inherited_state.defaultNS is not None:
            self.defaultNS = inherited_state.defaultNS
        else:
            self.defaultNS = None
    # end __init__

    def _URI(self, val):
        """Returns a URI for a 'pure' URI (i.e., not a CURIE). The method resolves possible relative URI-s. It also
        checks whether the URI uses an unusual URI scheme (and issues a warning); this may be the result of an
        uninterpreted CURIE...
        @param val: attribute value to be interpreted
        @type val: string
        @return: an RDFLib URIRef instance
        """
        def create_URIRef(uri, check=True):
            """
            Mini helper function: it checks whether a URI uses a usual scheme before a URIRef is created. If
            there is something unusual, a warning is generated (though the URIRef is created nevertheless)
            @param uri: (absolute) URI string
            @param check: whether the URI should be checked against the list of 'existing' URI schemes
            @return: an RDFLib URIRef instance
            """
            from . import uri_schemes
            val = uri.strip()
            if check and urlsplit(val)[0] not in uri_schemes:
                self.options.add_warning(err_URI_scheme % val, node=self.node.nodeName)
            return URIRef(val)

        def join(base, v, check=True):
            """
            Mini helper function: it makes a urljoin for the paths. Based on the Python library, but
            that one has a bug: in some cases it
            swallows the '#' or '?' character at the end. This is clearly a problem with
            Semantic Web URI-s, so this is checked, too
            @param base: base URI string
            @param v: local part
            @param check: whether the URI should be checked against the list of 'existing' URI schemes
            @return: an RDFLib URIRef instance
            """
            joined = urljoin(base, v)
            # Guard against empty strings explicitly instead of relying on an exception
            if v and joined and v[-1] in "#?" and v[-1] != joined[-1]:
                # put back the trailing '#' or '?' that urljoin has dropped
                return create_URIRef(joined + v[-1], check)
            return create_URIRef(joined, check)

        if val == "":
            # An empty value refers to the base itself
            return URIRef(self.base)

        # fall back on good old traditional URI-s.
        # To be on the safe side, let us use the Python libraries
        if self.parsedBase[0] == "":
            # base is, in fact, a local file name
            # The following call is just to be sure that some pathological cases when
            # the ':' _does_ appear in the URI but not in a scheme position are taken
            # care of properly...
            key = urlsplit(val)[0]
            if key == "":
                # relative URI, to be combined with local file name:
                return join(self.base, val, check=False)
            else:
                return create_URIRef(val)
        else:
            # Trust the Python library...
            # Well, not quite:-) there is what is, in my view, a bug in the urljoin; in some cases it
            # swallows the '#' or '?' character at the end. This is clearly a problem with
            # Semantic Web URI-s
            return join(self.base, val)
    # end _URI

    def _CURIEorURI(self, val):
        """Returns a URI for a (safe or not safe) CURIE. In case it is a safe CURIE but the CURIE itself
        is not defined, an error message is issued. Otherwise, if it is not a CURIE, it is taken to be a URI
        @param val: attribute value to be interpreted
        @type val: string
        @return: an RDFLib URIRef instance or None
        """
        if val == "":
            return URIRef(self.base)

        safe_curie = False
        if val[0] == '[':
            # If a safe CURIE is asked for, a pure URI is not acceptable.
            # This is checked below, and that is why the safe_curie flag is necessary
            if val[-1] != ']':
                # that is certainly forbidden: an incomplete safe CURIE
                self.options.add_warning(err_illegal_safe_CURIE % val, UnresolvablePrefix, node=self.node.nodeName)
                return None
            else:
                val = val[1:-1]
                safe_curie = True
        # There is a branch here depending on whether we are in 1.1 or 1.0 mode
        if self.rdfa_version >= "1.1":
            retval = self.term_or_curie.CURIE_to_URI(val)
            if retval is None:
                # the value could not be interpreted as a CURIE, i.e., it did not produce any valid URI.
                # The rule says that then the whole value should be considered as a URI
                # except if it was part of a safe CURIE. In that case it should be ignored...
                if safe_curie:
                    self.options.add_warning(err_no_CURIE_in_safe_CURIE % val, UnresolvablePrefix, node=self.node.nodeName)
                    return None
                else:
                    return self._URI(val)
            else:
                # there is an unlikely case where the retval is actually a URIRef with a relative URI. Better filter that one out
                if not isinstance(retval, BNode) and urlsplit(str(retval))[0] == "":
                    # yep, there is something wrong, a new URIRef has to be created:
                    return URIRef(self.base + str(retval))
                else:
                    return retval
        else:
            # in 1.0 mode a CURIE can be considered only in case of a safe CURIE
            if safe_curie:
                return self.term_or_curie.CURIE_to_URI(val)
            else:
                return self._URI(val)
    # end _CURIEorURI

    def _TERMorCURIEorAbsURI(self, val):
        """Returns a URI either for a term or for a CURIE. The value must be an NCNAME to be handled as a term; otherwise
        the method falls back on a CURIE or an absolute URI.
        @param val: attribute value to be interpreted
        @type val: string
        @return: an RDFLib URIRef instance or None
        """
        from . import uri_schemes
        # This case excludes the pure base, i.e., the empty value
        if val == "":
            return None

        from .termorcurie import termname
        if termname.match(val):
            # This is a term, must be handled as such...
            retval = self.term_or_curie.term_to_URI(val)
            if not retval:
                self.options.add_warning(err_undefined_terms % val, UnresolvableTerm, node=self.node.nodeName, buggy_value=val)
                return None
            else:
                return retval
        else:
            # try a CURIE
            retval = self.term_or_curie.CURIE_to_URI(val)
            if retval:
                return retval
            elif self.rdfa_version >= "1.1":
                # See if it is an absolute URI
                scheme = urlsplit(val)[0]
                if scheme == "":
                    # bug; there should be no relative URIs here
                    self.options.add_warning(err_non_legal_CURIE_ref % val, UnresolvablePrefix, node=self.node.nodeName)
                    return None
                else:
                    if scheme not in uri_schemes:
                        self.options.add_warning(err_URI_scheme % val.strip(), node=self.node.nodeName)
                    return URIRef(val)
            else:
                # rdfa 1.0 case
                self.options.add_warning(err_undefined_CURIE % val.strip(), UnresolvablePrefix, node=self.node.nodeName)
                return None
    # end _TERMorCURIEorAbsURI

    # -----------------------------------------------------------------------------------------------

    def getURI(self, attr):
        """Get the URI(s) for the attribute. The name of the attribute determines whether the value should be
        a pure URI, a CURIE, etc, and whether the return is a single element or a list of those. This is done
        using the L{ExecutionContext._resource_type} table.
        @param attr: attribute name
        @type attr: string
        @return: an RDFLib URIRef instance (or None) or a list of those
        """
        if self.node.hasAttribute(attr):
            val = self.node.getAttribute(attr)
        else:
            if attr in ExecutionContext._list:
                return []
            else:
                return None

        # The lookup raises a KeyError if the attribute has no entry in the table. This
        # should not happen if the code is correct, but it does no harm to handle it here...
        try:
            func = ExecutionContext._resource_type[attr]
        except KeyError:
            # Actually, this should not happen...
            func = ExecutionContext._URI

        if attr in ExecutionContext._list:
            # Allows for a list; split() already discards the surrounding white space
            resources = [ func(self, v) for v in val.split() ]
            retval = [ r for r in resources if r is not None ]
        else:
            retval = func(self, val.strip())
        return retval
    # end getURI

    def getResource(self, *args):
        """Get single resources from several different attributes. The first one that returns a valid URI wins.
        @param args: variable list of attribute names, or a single attribute being a list itself.
        @return: an RDFLib URIRef instance (or None)
        """
        if len(args) == 0:
            return None
        if isinstance(args[0], (tuple, list)):
            rargs = args[0]
        else:
            rargs = args

        for resource in rargs:
            uri = self.getURI(resource)
            if uri is not None:
                return uri
        return None

    # -----------------------------------------------------------------------------------------------
    def reset_list_mapping(self, origin=None):
        """
        Reset, i.e., create a new empty dictionary for the list mapping.
        """
        self.list_mapping = ListStructure()
        if origin:
            self.set_list_origin(origin)
        self.new_list = True

    def list_empty(self):
        """
        Checks whether the list is empty.
        @return: Boolean
        """
        return len(self.list_mapping.mapping) == 0

    def get_list_props(self):
        """
        Return the list of property values in the list structure
        @return: list of URIRef
        """
        return list(self.list_mapping.mapping.keys())

    def get_list_value(self, prop):
        """
        Return the list of values in the list structure for a specific property
        @return: list of RDF nodes
        """
        return self.list_mapping.mapping[prop]

    def set_list_origin(self, origin):
        """
        Set the origin of the list, i.e., the subject to attach the final list(s) to
        @param origin: URIRef
        """
        self.list_mapping.origin = origin

    def get_list_origin(self):
        """
        Return the origin of the list, i.e., the subject to attach the final list(s) to
        @return: URIRef
        """
        return self.list_mapping.origin

    def add_to_list_mapping(self, prop, resource):
        """Add a new property-resource on the list mapping structure. The latter is a dictionary of arrays;
        if the array does not exist yet, it will be created on the fly.

        @param prop: the property URI, used as a key in the dictionary
        @param resource: the resource to be added to the relevant array in the dictionary. Can be None; this is a dummy
        placeholder for C{<span rel="property" inlist>...</span>} constructions that may be filled in by children or siblings; if not,
        an empty list has to be generated.
        """
        if prop in self.list_mapping.mapping:
            if resource is not None:
                # indeed, if it is None, then it should not override anything
                if self.list_mapping.mapping[prop] is None:
                    # replacing a dummy with real content
                    self.list_mapping.mapping[prop] = [ resource ]
                else:
                    self.list_mapping.mapping[prop].append(resource)
        else:
            if resource is not None:
                self.list_mapping.mapping[prop] = [ resource ]
            else:
                self.list_mapping.mapping[prop] = None

####################
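
The "dictionary of arrays" used for C{@inlist} handling can be tried out in isolation. The sketch below is an assumption-laden simplification, not the class's method: it works on a plain dictionary instead of a C{ListStructure}, uses strings instead of RDFLib nodes, and the property names are invented. It follows the same rules as C{add_to_list_mapping}: a None resource only creates a placeholder, and real content replaces a placeholder or is appended to an existing array.

```python
def add_to_list_mapping(mapping, prop, resource):
    """Standalone variant of the method above, operating on a plain dict."""
    if resource is None:
        # placeholder only; never overrides content that is already there
        mapping.setdefault(prop, None)
    elif mapping.get(prop) is None:
        # new key, or a placeholder being replaced with real content
        mapping[prop] = [resource]
    else:
        mapping[prop].append(resource)

mapping = {}
add_to_list_mapping(mapping, "ex:author", None)     # placeholder
add_to_list_mapping(mapping, "ex:author", "Alice")  # replaces the placeholder
add_to_list_mapping(mapping, "ex:author", "Bob")    # appended
add_to_list_mapping(mapping, "ex:author", None)     # ignored, content exists
add_to_list_mapping(mapping, "ex:editor", None)     # stays a placeholder
```

A key that still maps to None at the end corresponds to the case described in the docstring where an empty list has to be generated.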
169            # handle the base element case for HTML
170            if self.options.host_language in [ HostLanguage.xhtml, HostLanguage.html5, HostLanguage.xhtml5  ]:
171                for bases in node.getElementsByTagName("base"):
172                    if bases.hasAttribute("href"):
173                        self.base = remove_frag_id(bases.getAttribute("href"))
174                        continue
175            elif self.options.host_language in accept_xml_base and node.hasAttribute("xml:base"):
176                self.base = remove_frag_id(node.getAttribute("xml:base"))
177                
178            # If no local setting for base occurs, the input argument has it
179            if self.base == "":
180                self.base = base
181                
182            # Perform an extra beautification in RDFLib
183            if self.options.host_language in beautifying_prefixes:
184                values = beautifying_prefixes[self.options.host_language]
185                for key in values:
186                    graph.bind(key, values[key])
187
188            input_info = "Input Host Language:%s, RDFa version:%s, base:%s" % (self.options.host_language, self.rdfa_version, self.base)
189            self.options.add_info(input_info)
190
191        #-----------------------------------------------------------------
192        # this will be used repeatedly, better store it once and for all...        
193        self.parsedBase = urlsplit(self.base)
194
195        #-----------------------------------------------------------------
196        # generate and store the local CURIE handling class instance
197        self.term_or_curie = TermOrCurie(self, graph, inherited_state)
198
199        #-----------------------------------------------------------------
200        # Settling the language tags
200        # @xml:lang has priority over @lang
202        # it is a bit messy: the three fundamental modes (xhtml, html, or xml) are all slightly different:-(
203        # first get the inherited state's language, if any
204        if inherited_state:
205            self.lang = inherited_state.lang
206        else:
207            self.lang = None
208            
209        self.supress_lang = False
210            
211            
212        if self.options.host_language in [ HostLanguage.xhtml, HostLanguage.xhtml5, HostLanguage.html5 ]:
213            # we may have lang and xml:lang
214            if node.hasAttribute("lang"):
215                lang = node.getAttribute("lang").lower()
216            else:
217                lang = None
218            if node.hasAttribute("xml:lang"):
219                xmllang = node.getAttribute("xml:lang").lower()
220            else:
221                xmllang = None
222            # First of all, set the value, if any
223            if xmllang != None:
224                # this has priority
225                if len(xmllang) != 0:
226                    self.lang = xmllang
227                else:
228                    self.lang = None
229            elif lang != None:
230                if len(lang) != 0:
231                    self.lang = lang
232                else:
233                    self.lang = None
234            # Ideally, a warning should be generated if lang and xmllang are both present with different values. But
235            # the HTML5 Parser does its magic by overriding a lang value if xmllang is present, so the potential
236            # error situations are simply swallowed...
237                
238        elif self.options.host_language in accept_xml_lang and node.hasAttribute("xml:lang"):
239                self.lang = node.getAttribute("xml:lang").lower()
240                if len(self.lang) == 0:
241                    self.lang = None
242            
243        #-----------------------------------------------------------------
244        # Set the default namespace. Used when generating XML Literals
245        if node.hasAttribute("xmlns"):
246            self.defaultNS = node.getAttribute("xmlns")
247        elif inherited_state and inherited_state.defaultNS != None:
248            self.defaultNS = inherited_state.defaultNS
249        else:
250            self.defaultNS = None
251    # end __init__
252
253    def _URI(self, val):
254        """Returns a URI for a 'pure' URI (ie, not a CURIE). The method resolves possible relative URI-s. It also
255        checks whether the URI uses an unusual URI scheme (and issues a warning); this may be the result of an
256        uninterpreted CURIE...
257        @param val: attribute value to be interpreted
258        @type val: string
259        @return: an RDFLib URIRef instance
260        """
261        def create_URIRef(uri, check=True):
262            """
263            Mini helping function: it checks whether a uri is using a usual scheme before a URIRef is created. In case
264            there is something unusual, a warning is generated (though the URIRef is created nevertheless)
265            @param uri: (absolute) URI string
266            @return: an RDFLib URIRef instance
267            """
268            from .    import uri_schemes
269            val = uri.strip()
270            if check and urlsplit(val)[0] not in uri_schemes:
271                self.options.add_warning(err_URI_scheme % val.strip(), node=self.node.nodeName)
272            return URIRef(val)
273
274        def join(base, v, check=True):
275            """
276            Mini helping function: it makes a urljoin for the paths. Based on the python library, but
277            that one has a bug: in some cases it
278            swallows the '#' or '?' character at the end. This is clearly a problem with
279            Semantic Web URI-s, so this is checked, too
280            @param base: base URI string
281            @param v: local part
282            @param check: whether the URI should be checked against the list of 'existing' URI schemes
283            @return: an RDFLib URIRef instance
284            """
285            
286            joined = urljoin(base, v)
287            try:
288                if v[-1] != joined[-1] and (v[-1] == "#" or v[-1] == "?"):
289                    return create_URIRef(joined + v[-1], check)
290                else:
291                    return create_URIRef(joined, check)
292            except:
293                return create_URIRef(joined, check)
294
295        if val == "":
296            # The fragment ID must be removed...
297            return URIRef(self.base)
298            
299        # fall back on good old traditional URI-s.
300        # To be on the safe side, let us use the Python libraries
301        if self.parsedBase[0] == "":
302            # base is, in fact, a local file name
303            # The following call is just to be sure that some pathological cases, when
304            # the ':' _does_ appear in the URI but not in a scheme position, are taken
305            # care of properly...
306            
307            key = urlsplit(val)[0]
308            if key == "":
309                # relative URI, to be combined with local file name:
310                return join(self.base, val, check = False)
311            else:
312                return create_URIRef(val)
313        else:
314            # Trust the python library...
315            # Well, not quite:-) there is what is, in my view, a bug in the urljoin; in some cases it
316            # swallows the '#' or '?' character at the end. This is clearly a problem with
317            # Semantic Web URI-s            
318            return join(self.base, val)
319    # end _URI
320
321    def _CURIEorURI(self, val):
322        """Returns a URI for a (safe or not safe) CURIE. In case it is a safe CURIE but the CURIE itself
323        is not defined, an error message is issued. Otherwise, if it is not a CURIE, it is taken to be a URI
324        @param val: attribute value to be interpreted
325        @type val: string
326        @return: an RDFLib URIRef instance or None
327        """
328        if val == "":
329            return URIRef(self.base)
330
331        safe_curie = False
332        if val[0] == '[':
333            # If a safe CURIE is asked for, a pure URI is not acceptable.
334            # Is checked below, and that is why the safe_curie flag is necessary
335            if val[-1] != ']':
336                # that is certainly forbidden: an incomplete safe CURIE
337                self.options.add_warning(err_illegal_safe_CURIE % val, UnresolvablePrefix, node=self.node.nodeName)
338                return None
339            else:
340                val = val[1:-1]
341                safe_curie = True
342        # There is a branch here depending on whether we are in 1.1 or 1.0 mode
343        if self.rdfa_version >= "1.1":
344            retval = self.term_or_curie.CURIE_to_URI(val)
345            if retval == None:
346                # the value could not be interpreted as a CURIE, ie, it did not produce any valid URI.
347                # The rule says that then the whole value should be considered as a URI
348                # except if it was part of a safe CURIE. In that case it should be ignored...
349                if safe_curie:
350                    self.options.add_warning(err_no_CURIE_in_safe_CURIE % val, UnresolvablePrefix, node=self.node.nodeName)
351                    return None
352                else:
353                    return self._URI(val)
354            else:
355                # there is an unlikely case where the retval is actually a URIRef with a relative URI. Better filter that one out
356                if isinstance(retval, BNode) == False and urlsplit(str(retval))[0] == "":
357                    # yep, there is something wrong, a new URIRef has to be created:
358                    return URIRef(self.base+str(retval))
359                else:
360                    return retval
361        else:
362            # in 1.0 mode a CURIE can be considered only in case of a safe CURIE
363            if safe_curie:
364                return self.term_or_curie.CURIE_to_URI(val)
365            else:
366                return self._URI(val)
367    # end _CURIEorURI
368
369    def _TERMorCURIEorAbsURI(self, val):
370        """Returns a URI either for a term or for a CURIE. The value must be an NCNAME to be handled as a term; otherwise
371        the method falls back on a CURIE or an absolute URI.
372        @param val: attribute value to be interpreted
373        @type val: string
374        @return: an RDFLib URIRef instance or None
375        """
376        from . import uri_schemes
377        # This case excludes the pure base, ie, the empty value
378        if val == "":
379            return None
380        
381        from .termorcurie import termname
382        if termname.match(val):
383            # This is a term, must be handled as such...            
384            retval = self.term_or_curie.term_to_URI(val)
385            if not retval:
386                self.options.add_warning(err_undefined_terms % val, UnresolvableTerm, node=self.node.nodeName, buggy_value = val)
387                return None
388            else:
389                return retval
390        else:
391            # try a CURIE
392            retval = self.term_or_curie.CURIE_to_URI(val)
393            if retval:
394                return retval
395            elif self.rdfa_version >= "1.1":
396                # See if it is an absolute URI
397                scheme = urlsplit(val)[0]
398                if scheme == "":
399                    # bug; there should be no relative URIs here
400                    self.options.add_warning(err_non_legal_CURIE_ref % val, UnresolvablePrefix, node=self.node.nodeName)
401                    return None
402                else:
403                    if scheme not in uri_schemes:
404                        self.options.add_warning(err_URI_scheme % val.strip(), node=self.node.nodeName)
405                    return URIRef(val)
406            else:
407                # rdfa 1.0 case
408                self.options.add_warning(err_undefined_CURIE % val.strip(), UnresolvablePrefix, node=self.node.nodeName)
409                return None
410    # end _TERMorCURIEorAbsURI
411
412    # -----------------------------------------------------------------------------------------------
413
414    def getURI(self, attr):
415        """Get the URI(s) for the attribute. The name of the attribute determines whether the value should be
416        a pure URI, a CURIE, etc, and whether the return is a single element or a list of those. This is done
417        using the L{ExecutionContext._resource_type} table.
418        @param attr: attribute name
419        @type attr: string
420        @return: an RDFLib URIRef instance (or None) or a list of those
421        """
422        if self.node.hasAttribute(attr):
423            val = self.node.getAttribute(attr)
424        else:
425            if attr in ExecutionContext._list:
426                return []
427            else:
428                return None
429        
430        # This may raise an exception if the attr has no key. This, actually,
431        # should not happen if the code is correct, but it does not harm having it here...
432        try:
433            func = ExecutionContext._resource_type[attr]
434        except:
435            # Actually, this should not happen...
436            func = ExecutionContext._URI
437        
438        if attr in ExecutionContext._list:
439            # Allows for a list
440            resources = [ func(self, v.strip()) for v in val.strip().split() if v != None ]
441            retval = [ r for r in resources if r != None ]
442        else:
443            retval = func(self, val.strip())
444        return retval
445    # end getURI
446    
447    def getResource(self, *args):
448        """Get single resources from several different attributes. The first one that returns a valid URI wins.
449        @param args: variable list of attribute names, or a single attribute being a list itself.
450        @return: an RDFLib URIRef instance (or None)
451        """
452        if len(args) == 0:
453            return None
454        if isinstance(args[0], tuple) or isinstance(args[0], list):
455            rargs = args[0]
456        else:
457            rargs = args
458            
459        for resource in rargs:
460            uri = self.getURI(resource)
461            if uri != None : return uri
462        return None
463    
464    # -----------------------------------------------------------------------------------------------
465    def reset_list_mapping(self, origin=None):
466        """
467        Reset, ie, create a new empty dictionary for the list mapping.
468        """
469        self.list_mapping = ListStructure()
470        if origin: self.set_list_origin(origin)
471        self.new_list = True
472
473    def list_empty(self):
474        """
475        Checks whether the list is empty.
476        @return: Boolean
477        """
478        return len(self.list_mapping.mapping) == 0
479        
480    def get_list_props(self):
481        """
482        Return the list of property values in the list structure
483        @return: list of URIRef
484        """
485        return list(self.list_mapping.mapping.keys())
486        
487    def get_list_value(self,prop):
488        """
489        Return the list of values in the list structure for a specific property
490        @return: list of RDF nodes
491        """
492        return self.list_mapping.mapping[prop]
493        
494    def set_list_origin(self, origin):
495        """
496        Set the origin of the list, ie, the subject to attach the final list(s) to
497        @param origin: URIRef
498        """        
499        self.list_mapping.origin = origin
500        
501    def get_list_origin(self):
502        """
503        Return the origin of the list, ie, the subject to attach the final list(s) to
504        @return: URIRef
505        """        
506        return self.list_mapping.origin
507        
508    def add_to_list_mapping(self, prop, resource):
509        """Add a new property-resource on the list mapping structure. The latter is a dictionary of arrays;
510        if the array does not exist yet, it will be created on the fly.
511        
512        @param prop: the property URI, used as a key in the dictionary
513        @param resource: the resource to be added to the relevant array in the dictionary. Can be None; this is a dummy
514        placeholder for C{<span rel="property" inlist>...</span>} constructions that may be filled in by children or siblings; if not,
515        an empty list has to be generated.
516        """
517        if prop in self.list_mapping.mapping:
518            if resource != None:
519                # indeed, if it is None, then it should not override anything
520                if self.list_mapping.mapping[prop] == None:
521                    # replacing a dummy with real content
522                    self.list_mapping.mapping[prop] = [ resource ]
523                else :            
524                    self.list_mapping.mapping[prop].append(resource)
525        else:
526            if resource != None:
527                self.list_mapping.mapping[prop] = [ resource ]
528            else:
529                self.list_mapping.mapping[prop] = None
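
The C{join} helper inside C{_URI} above restores a trailing '#' or '?' that the standard library's C{urljoin} may drop. The following is a minimal standalone sketch of that idea, not the module's own code: the function name C{join_keeping_trailing_mark} is hypothetical, and it returns a plain string rather than an RDFLib URIRef.

```python
from urllib.parse import urljoin


def join_keeping_trailing_mark(base, v):
    """Join v to base; put back a trailing '#' or '?' if urljoin dropped it.

    Hypothetical helper, modelled on the join() function in the listing.
    """
    joined = urljoin(base, v)
    # Only act when the local part ends in '#' or '?' and the joined
    # result no longer ends in that same character.
    if v and joined and v[-1] in "#?" and joined[-1] != v[-1]:
        return joined + v[-1]
    return joined


# A vocabulary-style reference whose trailing '#' matters for RDF.
print(join_keeping_trailing_mark("http://example.org/doc.html", "vocab#"))
# An ordinary relative reference is joined as usual.
print(join_keeping_trailing_mark("http://example.org/a/b", "c"))
```

The check is written so that the result is the same whether or not the underlying C{urljoin} preserves the trailing character.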

State at a specific node, including the current set of namespaces in the RDFLib sense, current language, the base, vocabularies, etc. The class is also used to interpret URI-s and CURIE-s to produce URI references for RDFLib.

@ivar options: reference to the overall options
@type options: L{Options}
@ivar base: the 'base' URI
@ivar parsedBase: the parsed version of base, as produced by urlparse.urlsplit
@ivar defaultNS: default namespace (if defined via @xmlns) to be used for XML Literals
@ivar lang: language tag (possibly None)
@ivar term_or_curie: vocabulary management class instance
@type term_or_curie: L{termorcurie.TermOrCurie}
@ivar list_mapping: dictionary of arrays, containing a list of URIs keyed by property, used for lists
@ivar node: the node to which this state belongs
@type node: DOM node instance
@ivar rdfa_version: RDFa version of the content
@type rdfa_version: String
@ivar supress_lang: in some cases, the effect of the lang attribute should be suppressed for the given node, although it should be inherited down below (example: @value attribute of the data element in HTML5)
@type supress_lang: Boolean
@cvar _list: list of attributes that allow for lists of values and should be treated as such
@cvar _resource_type: dictionary; mapping table from attribute name to the exact method to retrieve the URI(s). Is initialized at first instantiation.
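
How the C{lang} value is settled for the (X)HTML host languages can be sketched as a small pure function. This is an illustration under stated assumptions rather than the class's actual code: C{resolve_lang} is a hypothetical name, and the attribute values are passed in directly (C{None} meaning "attribute absent") instead of being read from a DOM node.

```python
def resolve_lang(inherited_lang, lang=None, xmllang=None):
    """Return the language tag for a node, or None if there is none.

    Hypothetical helper mirroring the (X)HTML branch of the constructor:
    xml:lang wins over lang, values are lower-cased, and an empty
    attribute value resets the language to None.
    """
    if xmllang is not None:
        value = xmllang.lower()
        return value if value else None
    if lang is not None:
        value = lang.lower()
        return value if value else None
    # Neither attribute is present: keep what was inherited.
    return inherited_lang


print(resolve_lang("en", lang="FR", xmllang="de"))  # xml:lang takes priority
print(resolve_lang("en", lang=""))                  # empty value resets to None
print(resolve_lang("en"))                           # inherited value is kept
```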

ExecutionContext( node, graph, inherited_state=None, base='', options=None, rdfa_version=None)

@param node: the current DOM Node
@param graph: the RDFLib Graph
@keyword inherited_state: the state as inherited from upper layers. This inherited_state is mixed with the state information retrieved from the current node.
@type inherited_state: L{state.ExecutionContext}
@keyword base: string denoting the base URI for the specific node. This overrides the possible base inherited from the upper layers. The current XHTML+RDFa syntax does not allow the usage of C{@xml:base}, but SVG1.2 does, so this is necessary for SVG (and other possible XML dialects that accept C{@xml:base})
@keyword options: invocation options, and references to warning graphs
@type options: L{Options<pyRdfa.options>}
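
At the top level, the constructor settles the base for HTML host languages by looking for a C{<base>} element, stripping any fragment identifier, and falling back to the C{base} argument when nothing is found. A minimal sketch of that sequence, using only the standard library; C{find_base} is a hypothetical name and the host-language checks are left out:

```python
from urllib.parse import urlparse, urlunparse
from xml.dom.minidom import parseString


def remove_frag_id(uri):
    """Return uri without its fragment identifier."""
    t = urlparse(uri)
    return urlunparse((t[0], t[1], t[2], t[3], t[4], ""))


def find_base(top_node, fallback_base=""):
    """Hypothetical helper: pick the base URI for an HTML-like document."""
    base = ""
    # As in the listing, every <base> element with @href is visited, so a
    # later one overrides an earlier one.
    for element in top_node.getElementsByTagName("base"):
        if element.hasAttribute("href"):
            base = remove_frag_id(element.getAttribute("href"))
    # If the document sets no base, the caller-supplied value is used.
    return base if base != "" else fallback_base


doc = parseString(
    '<html><head><base href="http://example.org/doc.html#top"/></head><body/></html>'
)
print(find_base(doc.documentElement, "http://fallback.example/"))
```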

node
parsedBase
term_or_curie
supress_lang
def getURI(self, attr):

Get the URI(s) for the attribute. The name of the attribute determines whether the value should be a pure URI, a CURIE, etc., and whether the return is a single element or a list of those. This is done using the L{ExecutionContext._resource_type} table.

@param attr: attribute name
@type attr: string
@return: an RDFLib URIRef instance (or None) or a list of those
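
The table-driven lookup described above can be illustrated with a minimal, dependency-free sketch. All names here (C{RESOURCE_TYPE}, C{LIST_ATTRS}, C{resolve_uri}, C{resolve_term_or_curie}, C{get_uri}) are hypothetical stand-ins rather than pyRdfa's own; plain strings take the place of RDFLib URIRef instances, a plain dictionary takes the place of the DOM node, and the prefix map is a toy assumption.

```python
# Hypothetical stand-ins; none of these names come from pyRdfa.
def resolve_uri(value):
    """Treat the value as a plain URI reference."""
    return value or None


def resolve_term_or_curie(value):
    """Expand a 'prefix:reference' CURIE using a toy prefix map."""
    prefixes = {"dc": "http://purl.org/dc/terms/"}
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + reference
    return None  # unknown prefix or bare term: no URI in this sketch


# Attribute name -> resolver function (the role of the resource-type table)
RESOURCE_TYPE = {"href": resolve_uri, "rel": resolve_term_or_curie}
# Attributes whose value is a whitespace-separated list
LIST_ATTRS = {"rel"}


def get_uri(attributes, attr):
    """Resolve one attribute of a dict standing in for a DOM node."""
    if attr not in attributes:
        # A missing list attribute yields an empty list, otherwise None
        return [] if attr in LIST_ATTRS else None
    func = RESOURCE_TYPE.get(attr, resolve_uri)  # fall back to plain URI
    value = attributes[attr].strip()
    if attr in LIST_ATTRS:
        resolved = (func(v) for v in value.split())
        return [r for r in resolved if r is not None]  # drop unresolvable items
    return func(value)
```

In this sketch, a C{rel} value containing an unresolvable item simply loses that item, while a missing C{rel} produces an empty list and a missing C{href} produces None, mirroring the two return shapes described in the docstring.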

def getResource(self, *args):
    def getResource(self, *args):
        """Get single resources from several different attributes. The first one that returns a valid URI wins.
        @param args: variable list of attribute names, or a single attribute being a list itself.
        @return: an RDFLib URIRef instance (or None)
        """
        if len(args) == 0:
            return None
        if isinstance(args[0], (tuple, list)):
            rargs = args[0]
        else:
            rargs = args

        for resource in rargs:
            uri = self.getURI(resource)
            if uri is not None:
                return uri
        return None

Get single resources from several different attributes. The first one that returns a valid URI wins.

@param args: variable list of attribute names, or a single attribute being a list itself.
@return: an RDFLib URIRef instance (or None)
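
The "first one wins" rule can be sketched in isolation as follows. The function C{first_resource} and its C{lookup} parameter are hypothetical: C{lookup} stands in for the per-attribute resolution step and is assumed to return None when an attribute yields nothing.

```python
def first_resource(lookup, *args):
    """Return the first non-None result of lookup() over the given names.

    Hypothetical helper: accepts either several names or one list/tuple
    of names, as the documented method does.
    """
    if not args:
        return None
    names = args[0] if isinstance(args[0], (tuple, list)) else args
    for name in names:
        uri = lookup(name)
        if uri is not None:
            return uri
    return None


# A plain dict stands in for the attributes of the current node
attrs = {"href": "http://example.org/doc", "src": "http://example.org/img"}
winner = first_resource(attrs.get, "resource", "href", "src")
```

Because only None counts as "no result" here, the order of the names decides which attribute takes precedence, and any non-None value (including an empty list) would stop the search.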

def reset_list_mapping(self, origin=None):
    def reset_list_mapping(self, origin=None):
        """
        Reset, i.e., create a new empty dictionary for the list mapping.
        """
        self.list_mapping = ListStructure()
        if origin:
            self.set_list_origin(origin)
        self.new_list = True

Reset, i.e., create a new empty dictionary for the list mapping.

def list_empty(self):
    def list_empty(self):
        """
        Checks whether the list is empty.
        @return: Boolean
        """
        return len(self.list_mapping.mapping) == 0

Checks whether the list is empty.

@return: Boolean

def get_list_props(self):
    def get_list_props(self):
        """
        Return the list of property values in the list structure
        @return: list of URIRef
        """
        return list(self.list_mapping.mapping.keys())

Return the list of property values in the list structure.

@return: list of URIRef

def get_list_value(self, prop):
    def get_list_value(self, prop):
        """
        Return the list of values in the list structure for a specific property
        @return: list of RDF nodes
        """
        return self.list_mapping.mapping[prop]

Return the list of values in the list structure for a specific property.

@return: list of RDF nodes

def set_list_origin(self, origin):
    def set_list_origin(self, origin):
        """
        Set the origin of the list, i.e., the subject to attach the final list(s) to
        @param origin: URIRef
        """
        self.list_mapping.origin = origin

Set the origin of the list, i.e., the subject to attach the final list(s) to.

@param origin: URIRef

def get_list_origin(self):
    def get_list_origin(self):
        """
        Return the origin of the list, i.e., the subject to attach the final list(s) to
        @return: URIRef
        """
        return self.list_mapping.origin

Return the origin of the list, i.e., the subject to attach the final list(s) to.

@return: URIRef
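
The list-handling accessors above all operate on one container that holds a property-to-values mapping and an origin. The following is a rough sketch of that arrangement, assuming only the two attributes (C{mapping}, C{origin}) that the listed methods touch; the classes C{ListStructureSketch} and C{ContextSketch} are hypothetical and are not the library's own definitions.

```python
class ListStructureSketch:
    """Hypothetical stand-in: a property -> values mapping plus an origin."""

    def __init__(self):
        self.mapping = {}
        self.origin = None


class ContextSketch:
    """Hypothetical holder exposing a few of the accessors described above."""

    def __init__(self):
        self.reset_list_mapping()

    def reset_list_mapping(self, origin=None):
        # Start over with a fresh, empty structure
        self.list_mapping = ListStructureSketch()
        if origin:
            self.list_mapping.origin = origin
        self.new_list = True

    def list_empty(self):
        return len(self.list_mapping.mapping) == 0

    def get_list_props(self):
        return list(self.list_mapping.mapping.keys())


ctx = ContextSketch()
ctx.reset_list_mapping(origin="http://example.org/subject")
ctx.list_mapping.mapping["http://example.org/p"] = ["a", "b"]
```

Resetting replaces the whole container rather than clearing it in place, so in this sketch any previously stored origin is discarded unless a new one is passed in.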

def add_to_list_mapping(self, prop, resource):
    def add_to_list_mapping(self, prop, resource):
        """Add a new property-resource on the list mapping structure. The latter is a dictionary of arrays;
        if the array does not exist yet, it will be created on the fly.

        @param prop: the property URI, used as a key in the dictionary
        @param resource: the resource to be added to the relevant array in the dictionary. Can be None; this is a dummy
        placeholder for C{<span rel="property" inlist>...</span>} constructions that may be filled in by children or siblings; if not,
        an empty list has to be generated.
        """
        if prop in self.list_mapping.mapping:
            if resource is not None:
                # indeed, if it is None, then it should not override anything
                if self.list_mapping.mapping[prop] is None:
                    # replacing a dummy with real content
                    self.list_mapping.mapping[prop] = [resource]
                else:
                    self.list_mapping.mapping[prop].append(resource)
        else:
            if resource is not None:
                self.list_mapping.mapping[prop] = [resource]
            else:
                self.list_mapping.mapping[prop] = None

Add a new property-resource on the list mapping structure. The latter is a dictionary of arrays; if the array does not exist yet, it will be created on the fly.

@param prop: the property URI, used as a key in the dictionary
@param resource: the resource to be added to the relevant array in the dictionary. Can be None; this is a dummy placeholder for C{<span rel="property" inlist>...</span>} constructions that may be filled in by children or siblings; if not, an empty list has to be generated.
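
The placeholder behaviour described above can be traced with a small standalone sketch. The function C{add_to_mapping} is a hypothetical free-function version of the same logic, working on a plain dictionary; the property names and values are made-up strings.

```python
def add_to_mapping(mapping, prop, resource):
    """Record resource under prop; None only creates a placeholder entry."""
    if prop in mapping:
        if resource is not None:
            if mapping[prop] is None:
                # Replace the placeholder with real content
                mapping[prop] = [resource]
            else:
                mapping[prop].append(resource)
        # A None resource never overrides an existing entry
    else:
        mapping[prop] = [resource] if resource is not None else None


mapping = {}
add_to_mapping(mapping, "ex:p", None)  # placeholder only
add_to_mapping(mapping, "ex:p", "a")   # placeholder replaced by ["a"]
add_to_mapping(mapping, "ex:p", None)  # ignored, entry already has content
add_to_mapping(mapping, "ex:p", "b")   # appended
add_to_mapping(mapping, "ex:q", None)  # stays a placeholder
```

A property whose entry is still None at the end is the case the docstring refers to: nothing filled it in, so the consumer of the mapping has to generate an empty list for it.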