pyRdfa.state
Parser's execution context (a.k.a. state) object and handling. The state includes:
- language, retrieved from C{@xml:lang} or C{@lang}
- URI base, determined by C{<base>} or set explicitly. This is a little bit superfluous, because the current RDFa syntax does not make use of C{@xml:base}; i.e., this could be a global value. But the structure is prepared to add C{@xml:base} easily, if needed.
- options, in the form of an L{options<pyRdfa.options>} instance
- a separate vocabulary/CURIE handling resource, in the form of an L{termorcurie<pyRdfa.TermOrCurie>} instance
The execution context object is also used to handle URI-s, CURIE-s, terms, etc.
@summary: RDFa parser execution context
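@organization: U{World Wide Web Consortium<http://www.w3.org>}
@author: U{Ivan Herman<a href="http://www.w3.org/People/Ivan/">}
@license: This software is available for use under the
U{W3C® SOFTWARE NOTICE AND LICENSE<href="http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231">}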
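As a minimal sketch of the "URI base" item above: the module strips any fragment identifier from the base before storing it. The standalone function below mirrors the nested remove_frag_id helper of the source using only urllib.parse; the example URL is an illustrative assumption, not taken from the module.

```python
from urllib.parse import urlparse, urlunparse

def remove_frag_id(uri):
    """Return `uri` without its fragment identifier."""
    parts = urlparse(uri)
    # Keep scheme, netloc, path, params and query; blank out the fragment.
    return urlunparse((parts[0], parts[1], parts[2], parts[3], parts[4], ""))

# Hypothetical input, for illustration only.
print(remove_frag_id("http://example.org/doc.html#section"))
```

Storing the base without a fragment means that an empty attribute value can be resolved to the base URI directly.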
# -*- coding: utf-8 -*-
"""
Parser's execution context (a.k.a. state) object and handling. The state includes:

  - language, retrieved from C{@xml:lang} or C{@lang}
  - URI base, determined by C{<base>} or set explicitly. This is a little bit superfluous, because the current RDFa syntax does not make use of C{@xml:base}; i.e., this could be a global value. But the structure is prepared to add C{@xml:base} easily, if needed.
  - options, in the form of an L{options<pyRdfa.options>} instance
  - a separate vocabulary/CURIE handling resource, in the form of an L{termorcurie<pyRdfa.TermOrCurie>} instance

The execution context object is also used to handle URI-s, CURIE-s, terms, etc.

@summary: RDFa parser execution context
@organization: U{World Wide Web Consortium<http://www.w3.org>}
@author: U{Ivan Herman<a href="http://www.w3.org/People/Ivan/">}
@license: This software is available for use under the
U{W3C® SOFTWARE NOTICE AND LICENSE<href="http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231">}
"""

"""
$Id: state.py,v 1.23 2013-10-16 11:48:54 ivan Exp $
$Date: 2013-10-16 11:48:54 $
"""

from rdflib import URIRef
from rdflib import BNode

from .host import HostLanguage, accept_xml_base, accept_xml_lang, beautifying_prefixes

from .termorcurie import TermOrCurie
from . import UnresolvablePrefix, UnresolvableTerm

from . import err_URI_scheme
from . import err_illegal_safe_CURIE
from . import err_no_CURIE_in_safe_CURIE
from . import err_undefined_terms
from . import err_non_legal_CURIE_ref
from . import err_undefined_CURIE

from urllib.parse import urlparse, urlunparse, urlsplit, urljoin

class ListStructure:
    """Special class to handle the C{@inlist} type structures in RDFa 1.1; stores the "origin", i.e,
    where the list will be attached to, and the mappings as defined in the spec.
    """
    def __init__(self):
        self.mapping = {}
        self.origin = None

#### Core Class definition
class ExecutionContext:
    """State at a specific node, including the current set of namespaces in the RDFLib sense, current language,
    the base, vocabularies, etc. The class is also used to interpret URI-s and CURIE-s to produce
    URI references for RDFLib.

    @ivar options: reference to the overall options
    @type options: L{Options}
    @ivar base: the 'base' URI
    @ivar parsedBase: the parsed version of base, as produced by urlparse.urlsplit
    @ivar defaultNS: default namespace (if defined via @xmlns) to be used for XML Literals
    @ivar lang: language tag (possibly None)
    @ivar term_or_curie: vocabulary management class instance
    @type term_or_curie: L{termorcurie.TermOrCurie}
    @ivar list_mapping: dictionary of arrays, containing a list of URIs key-ed via properties for lists
    @ivar node: the node to which this state belongs
    @type node: DOM node instance
    @ivar rdfa_version: RDFa version of the content
    @type rdfa_version: String
    @ivar supress_lang: in some cases, the effect of the lang attribute should be suppressed for the given node, although it should be inherited down below (example: @value attribute of the data element in HTML5)
    @type supress_lang: Boolean
    @cvar _list: list of attributes that allow for lists of values and should be treated as such
    @cvar _resource_type: dictionary; mapping table from attribute name to the exact method to retrieve the URI(s). Is initialized at first instantiation.
    """

    # list of attributes that allow for lists of values and should be treated as such
    _list = [ "rel", "rev", "property", "typeof", "role" ]
    # mapping table from attribute name to the exact method to retrieve the URI(s).
    _resource_type = {}

    def __init__(self, node, graph, inherited_state=None, base="", options=None, rdfa_version=None):
        """
        @param node: the current DOM Node
        @param graph: the RDFLib Graph
        @keyword inherited_state: the state as inherited
        from upper layers. This inherited_state is mixed with the state information
        retrieved from the current node.
        @type inherited_state: L{state.ExecutionContext}
        @keyword base: string denoting the base URI for the specific node. This overrides the possible
        base inherited from the upper layers. The
        current XHTML+RDFa syntax does not allow the usage of C{@xml:base}, but SVG1.2 does, so this is
        necessary for SVG (and other possible XML dialects that accept C{@xml:base})
        @keyword options: invocation options, and references to warning graphs
        @type options: L{Options<pyRdfa.options>}
        """
        def remove_frag_id(uri):
            """
            The fragment ID for self.base must be removed
            """
            try:
                # To be on the safe side:-)
                t = urlparse(uri)
                return urlunparse((t[0],t[1],t[2],t[3],t[4],""))
            except:
                return uri

        # This is, conceptually, an additional class initialization, but it must be done run time, otherwise import errors show up
        if len(ExecutionContext._resource_type) == 0 :
            ExecutionContext._resource_type = {
                "href"     : ExecutionContext._URI,
                "src"      : ExecutionContext._URI,
                "vocab"    : ExecutionContext._URI,

                "about"    : ExecutionContext._CURIEorURI,
                "resource" : ExecutionContext._CURIEorURI,

                "rel"      : ExecutionContext._TERMorCURIEorAbsURI,
                "rev"      : ExecutionContext._TERMorCURIEorAbsURI,
                "datatype" : ExecutionContext._TERMorCURIEorAbsURI,
                "typeof"   : ExecutionContext._TERMorCURIEorAbsURI,
                "property" : ExecutionContext._TERMorCURIEorAbsURI,
                "role"     : ExecutionContext._TERMorCURIEorAbsURI,
            }
        #-----------------------------------------------------------------
        self.node = node

        #-----------------------------------------------------------------
        # Settling the base. In a generic XML, xml:base should be accepted at all levels (though this is not the
        # case in, say, XHTML...)
        # At the moment, it is invoked with a 'None' at the top level of parsing, that is
        # when the <base> element is looked for (for the HTML cases, that is)
        if inherited_state:
            self.rdfa_version = inherited_state.rdfa_version
            self.base = inherited_state.base
            self.options = inherited_state.options

            self.list_mapping = inherited_state.list_mapping
            self.new_list = False

            # for generic XML versions the xml:base attribute should be handled
            if self.options.host_language in accept_xml_base and node.hasAttribute("xml:base"):
                self.base = remove_frag_id(node.getAttribute("xml:base"))
        else:
            # this is the branch called from the very top
            self.list_mapping = ListStructure()
            self.new_list = True

            if rdfa_version is not None:
                self.rdfa_version = rdfa_version
            else:
                from . import rdfa_current_version
                self.rdfa_version = rdfa_current_version

            # This value can be overwritten by a @version attribute
            if node.hasAttribute("version"):
                top_version = node.getAttribute("version")
                if top_version.find("RDFa 1.0") != -1 or top_version.find("RDFa1.0") != -1:
                    self.rdfa_version = "1.0"
                elif top_version.find("RDFa 1.1") != -1 or top_version.find("RDFa1.1") != -1:
                    self.rdfa_version = "1.1"

            # this is just to play safe. I believe this should actually not happen...
            if options == None:
                from . import Options
                self.options = Options()
            else:
                self.options = options

            self.base = ""
            # handle the base element case for HTML
            if self.options.host_language in [ HostLanguage.xhtml, HostLanguage.html5, HostLanguage.xhtml5 ]:
                for bases in node.getElementsByTagName("base"):
                    if bases.hasAttribute("href"):
                        self.base = remove_frag_id(bases.getAttribute("href"))
                        continue
            elif self.options.host_language in accept_xml_base and node.hasAttribute("xml:base"):
                self.base = remove_frag_id(node.getAttribute("xml:base"))

            # If no local setting for base occurs, the input argument has it
            if self.base == "":
                self.base = base

            # Perform an extra beautification in RDFLib
            if self.options.host_language in beautifying_prefixes:
                values = beautifying_prefixes[self.options.host_language]
                for key in values:
                    graph.bind(key, values[key])

            input_info = "Input Host Language:%s, RDFa version:%s, base:%s" % (self.options.host_language, self.rdfa_version, self.base)
            self.options.add_info(input_info)

        #-----------------------------------------------------------------
        # this will be used repeatedly, better store it once and for all...
        self.parsedBase = urlsplit(self.base)

        #-----------------------------------------------------------------
        # generate and store the local CURIE handling class instance
        self.term_or_curie = TermOrCurie(self, graph, inherited_state)

        #-----------------------------------------------------------------
        # Settling the language tags
        # @xml:lang has priority over @lang (this is what the code below implements)
        # it is a bit messy: the three fundamental modes (xhtml, html, or xml) are all slightly different:-(
        # first get the inherited state's language, if any
        if inherited_state:
            self.lang = inherited_state.lang
        else:
            self.lang = None

        self.supress_lang = False


        if self.options.host_language in [ HostLanguage.xhtml, HostLanguage.xhtml5, HostLanguage.html5 ]:
            # we may have lang and xml:lang
            if node.hasAttribute("lang"):
                lang = node.getAttribute("lang").lower()
            else:
                lang = None
            if node.hasAttribute("xml:lang"):
                xmllang = node.getAttribute("xml:lang").lower()
            else:
                xmllang = None
            # First of all, set the value, if any
            if xmllang != None:
                # this has priority
                if len(xmllang) != 0:
                    self.lang = xmllang
                else:
                    self.lang = None
            elif lang != None:
                if len(lang) != 0:
                    self.lang = lang
                else:
                    self.lang = None
            # Ideally, a warning should be generated if lang and xmllang are both present with different values. But
            # the HTML5 Parser does its magic by overriding a lang value if xmllang is present, so the potential
            # error situations are simply swallowed...

        elif self.options.host_language in accept_xml_lang and node.hasAttribute("xml:lang"):
            self.lang = node.getAttribute("xml:lang").lower()
            if len(self.lang) == 0:
                self.lang = None

        #-----------------------------------------------------------------
        # Set the default namespace. Used when generating XML Literals
        if node.hasAttribute("xmlns"):
            self.defaultNS = node.getAttribute("xmlns")
        elif inherited_state and inherited_state.defaultNS != None:
            self.defaultNS = inherited_state.defaultNS
        else:
            self.defaultNS = None
    # end __init__

    def _URI(self, val):
        """Returns a URI for a 'pure' URI (ie, not a CURIE). The method resolves possible relative URI-s. It also
        checks whether the URI uses an unusual URI scheme (and issues a warning); this may be the result of an
        uninterpreted CURIE...
        @param val: attribute value to be interpreted
        @type val: string
        @return: an RDFLib URIRef instance
        """
        def create_URIRef(uri, check=True):
            """
            Mini helping function: it checks whether a uri is using a usual scheme before a URIRef is created. In case
            there is something unusual, a warning is generated (though the URIRef is created nevertheless)
            @param uri: (absolute) URI string
            @return: an RDFLib URIRef instance
            """
            from . import uri_schemes
            val = uri.strip()
            if check and urlsplit(val)[0] not in uri_schemes:
                self.options.add_warning(err_URI_scheme % val.strip(), node=self.node.nodeName)
            return URIRef(val)

        def join(base, v, check=True):
            """
            Mini helping function: it makes a urljoin for the paths. Based on the python library, but
            that one has a bug: in some cases it
            swallows the '#' or '?' character at the end. This is clearly a problem with
            Semantic Web URI-s, so this is checked, too
            @param base: base URI string
            @param v: local part
            @param check: whether the URI should be checked against the list of 'existing' URI schemes
            @return: an RDFLib URIRef instance
            """

            joined = urljoin(base, v)
            try:
                if v[-1] != joined[-1] and (v[-1] == "#" or v[-1] == "?"):
                    return create_URIRef(joined + v[-1], check)
                else:
                    return create_URIRef(joined, check)
            except:
                return create_URIRef(joined, check)

        if val == "":
            # The fragment ID must be removed...
            return URIRef(self.base)

        # fall back on good old traditional URI-s.
        # To be on the safe side, let us use the Python libraries
        if self.parsedBase[0] == "":
            # base is, in fact, a local file name
            # The following call is just to be sure that some pathological cases when
            # the ':' _does_ appear in the URI but not in a scheme position is taken
            # care of properly...

            key = urlsplit(val)[0]
            if key == "":
                # relative URI, to be combined with local file name:
                return join(self.base, val, check = False)
            else:
                return create_URIRef(val)
        else:
            # Trust the python library...
            # Well, not quite:-) there is what is, in my view, a bug in the urljoin; in some cases it
            # swallows the '#' or '?' character at the end. This is clearly a problem with
            # Semantic Web URI-s
            return join(self.base, val)
    # end _URI

    def _CURIEorURI(self, val):
        """Returns a URI for a (safe or not safe) CURIE. In case it is a safe CURIE but the CURIE itself
        is not defined, an error message is issued. Otherwise, if it is not a CURIE, it is taken to be a URI
        @param val: attribute value to be interpreted
        @type val: string
        @return: an RDFLib URIRef instance or None
        """
        if val == "":
            return URIRef(self.base)

        safe_curie = False
        if val[0] == '[':
            # If a safe CURIE is asked for, a pure URI is not acceptable.
            # Is checked below, and that is why the safe_curie flag is necessary
            if val[-1] != ']':
                # that is certainly forbidden: an incomplete safe CURIE
                self.options.add_warning(err_illegal_safe_CURIE % val, UnresolvablePrefix, node=self.node.nodeName)
                return None
            else:
                val = val[1:-1]
                safe_curie = True
        # There is a branch here depending on whether we are in 1.1 or 1.0 mode
        if self.rdfa_version >= "1.1":
            retval = self.term_or_curie.CURIE_to_URI(val)
            if retval == None:
                # the value could not be interpreted as a CURIE, ie, it did not produce any valid URI.
                # The rule says that then the whole value should be considered as a URI
                # except if it was part of a safe CURIE. In that case it should be ignored...
                if safe_curie:
                    self.options.add_warning(err_no_CURIE_in_safe_CURIE % val, UnresolvablePrefix, node=self.node.nodeName)
                    return None
                else:
                    return self._URI(val)
            else:
                # there is an unlikely case where the retval is actually a URIRef with a relative URI. Better filter that one out
                if isinstance(retval, BNode) == False and urlsplit(str(retval))[0] == "":
                    # yep, there is something wrong, a new URIRef has to be created:
                    return URIRef(self.base+str(retval))
                else:
                    return retval
        else:
            # in 1.0 mode a CURIE can be considered only in case of a safe CURIE
            if safe_curie:
                return self.term_or_curie.CURIE_to_URI(val)
            else:
                return self._URI(val)
    # end _CURIEorURI

    def _TERMorCURIEorAbsURI(self, val):
        """Returns a URI either for a term or for a CURIE. The value must be an NCNAME to be handled as a term; otherwise
        the method falls back on a CURIE or an absolute URI.
        @param val: attribute value to be interpreted
        @type val: string
        @return: an RDFLib URIRef instance or None
        """
        from . import uri_schemes
        # This case excludes the pure base, ie, the empty value
        if val == "":
            return None

        from .termorcurie import termname
        if termname.match(val):
            # This is a term, must be handled as such...
            retval = self.term_or_curie.term_to_URI(val)
            if not retval:
                self.options.add_warning(err_undefined_terms % val, UnresolvableTerm, node=self.node.nodeName, buggy_value = val)
                return None
            else:
                return retval
        else:
            # try a CURIE
            retval = self.term_or_curie.CURIE_to_URI(val)
            if retval:
                return retval
            elif self.rdfa_version >= "1.1":
                # See if it is an absolute URI
                scheme = urlsplit(val)[0]
                if scheme == "":
                    # bug; there should be no relative URIs here
                    self.options.add_warning(err_non_legal_CURIE_ref % val, UnresolvablePrefix, node=self.node.nodeName)
                    return None
                else:
                    if scheme not in uri_schemes:
                        self.options.add_warning(err_URI_scheme % val.strip(), node=self.node.nodeName)
                    return URIRef(val)
            else:
                # rdfa 1.0 case
                self.options.add_warning(err_undefined_CURIE % val.strip(), UnresolvablePrefix, node=self.node.nodeName)
                return None
    # end _TERMorCURIEorAbsURI

    # -----------------------------------------------------------------------------------------------

    def getURI(self, attr):
        """Get the URI(s) for the attribute. The name of the attribute determines whether the value should be
        a pure URI, a CURIE, etc, and whether the return is a single element of a list of those. This is done
        using the L{ExecutionContext._resource_type} table.
        @param attr: attribute name
        @type attr: string
        @return: an RDFLib URIRef instance (or None) or a list of those
        """
        if self.node.hasAttribute(attr):
            val = self.node.getAttribute(attr)
        else:
            if attr in ExecutionContext._list:
                return []
            else:
                return None

        # This may raise an exception if the attr has no key. This, actually,
        # should not happen if the code is correct, but it does not harm having it here...
        try:
            func = ExecutionContext._resource_type[attr]
        except:
            # Actually, this should not happen...
            func = ExecutionContext._URI

        if attr in ExecutionContext._list:
            # Allows for a list
            resources = [ func(self, v.strip()) for v in val.strip().split() if v != None ]
            retval = [ r for r in resources if r != None ]
        else:
            retval = func(self, val.strip())
        return retval
    # end getURI

    def getResource(self, *args):
        """Get single resources from several different attributes. The first one that returns a valid URI wins.
        @param args: variable list of attribute names, or a single attribute being a list itself.
        @return: an RDFLib URIRef instance (or None):
        """
        if len(args) == 0:
            return None
        if isinstance(args[0], tuple) or isinstance(args[0], list):
            rargs = args[0]
        else:
            rargs = args

        for resource in rargs:
            uri = self.getURI(resource)
            if uri != None : return uri
        return None

    # -----------------------------------------------------------------------------------------------
    def reset_list_mapping(self, origin=None):
        """
        Reset, ie, create a new empty dictionary for the list mapping.
        """
        self.list_mapping = ListStructure()
        if origin: self.set_list_origin(origin)
        self.new_list = True

    def list_empty(self):
        """
        Checks whether the list is empty.
        @return: Boolean
        """
        return len(self.list_mapping.mapping) == 0

    def get_list_props(self):
        """
        Return the list of property values in the list structure
        @return: list of URIRef
        """
        return list(self.list_mapping.mapping.keys())

    def get_list_value(self,prop):
        """
        Return the list of values in the list structure for a specific property
        @return: list of RDF nodes
        """
        return self.list_mapping.mapping[prop]

    def set_list_origin(self, origin):
        """
        Set the origin of the list, ie, the subject to attach the final list(s) to
        @param origin: URIRef
        """
        self.list_mapping.origin = origin

    def get_list_origin(self):
        """
        Return the origin of the list, ie, the subject to attach the final list(s) to
        @return: URIRef
        """
        return self.list_mapping.origin

    def add_to_list_mapping(self, prop, resource):
        """Add a new property-resource on the list mapping structure. The latter is a dictionary of arrays;
        if the array does not exist yet, it will be created on the fly.

        @param prop: the property URI, used as a key in the dictionary
        @param resource: the resource to be added to the relevant array in the dictionary. Can be None; this is a dummy
        placeholder for C{<span rel="property" inlist>...</span>} constructions that may be filled in by children or siblings; if not
        an empty list has to be generated.
        """
        if prop in self.list_mapping.mapping:
            if resource != None:
                # indeed, if it is None, than it should not override anything
                if self.list_mapping.mapping[prop] == None:
                    # replacing a dummy with real content
                    self.list_mapping.mapping[prop] = [ resource ]
                else :
                    self.list_mapping.mapping[prop].append(resource)
        else:
            if resource != None:
                self.list_mapping.mapping[prop] = [ resource ]
            else:
                self.list_mapping.mapping[prop] = None

####################
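The join helper inside _URI guards against urljoin dropping a trailing '#' or '?', which matters for vocabulary URIs that end in '#'. A minimal standalone sketch of that guard follows; the function name join_keep_trailing and the example URLs are hypothetical, and the warning and URIRef handling of the real helper are left out.

```python
from urllib.parse import urljoin

def join_keep_trailing(base, ref):
    """Join `ref` onto `base`; restore a trailing '#' or '?' if urljoin dropped it."""
    joined = urljoin(base, ref)
    # Only patch the result when the reference ended in '#' or '?'
    # and the joined URI does not end in that same character.
    if ref and joined and ref[-1] in "#?" and joined[-1] != ref[-1]:
        joined += ref[-1]
    return joined

# Hypothetical base and reference, for illustration only.
print(join_keep_trailing("http://example.org/a/b", "vocab#"))
```

Because the check compares the final characters after joining, the sketch gives the same result whether or not the installed Python version of urljoin preserves the trailing character.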
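The list mapping kept by ListStructure can be illustrated with a minimal sketch of the placeholder logic in add_to_list_mapping. A plain dictionary stands in for ListStructure.mapping and strings stand in for RDFLib nodes; both are simplifications for illustration, and the property and resource names are hypothetical.

```python
def add_to_list_mapping(mapping, prop, resource):
    """Record `resource` under `prop`; None acts as an 'empty so far' placeholder."""
    if prop in mapping:
        if resource is not None:
            if mapping[prop] is None:
                # Replace the placeholder with real content.
                mapping[prop] = [resource]
            else:
                mapping[prop].append(resource)
        # A None resource never overrides an existing entry.
    elif resource is not None:
        mapping[prop] = [resource]
    else:
        mapping[prop] = None
    return mapping

mapping = {}
add_to_list_mapping(mapping, "ex:authors", None)        # placeholder only
add_to_list_mapping(mapping, "ex:authors", "ex:alice")  # placeholder replaced
add_to_list_mapping(mapping, "ex:authors", "ex:bob")    # appended
add_to_list_mapping(mapping, "ex:authors", None)        # ignored
print(mapping)
```

A property whose entry is still None once processing ends corresponds to the case the docstring describes, where an empty list has to be generated.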
The value must be an NCNAME to be handled as a term; otherwise 371 the method falls back on a CURIE or an absolute URI. 372 @param val: attribute value to be interpreted 373 @type val: string 374 @return: an RDFLib URIRef instance or None 375 """ 376 from . import uri_schemes 377 # This case excludes the pure base, ie, the empty value 378 if val == "": 379 return None 380 381 from .termorcurie import termname 382 if termname.match(val): 383 # This is a term, must be handled as such... 384 retval = self.term_or_curie.term_to_URI(val) 385 if not retval: 386 self.options.add_warning(err_undefined_terms % val, UnresolvableTerm, node=self.node.nodeName, buggy_value = val) 387 return None 388 else: 389 return retval 390 else: 391 # try a CURIE 392 retval = self.term_or_curie.CURIE_to_URI(val) 393 if retval: 394 return retval 395 elif self.rdfa_version >= "1.1": 396 # See if it is an absolute URI 397 scheme = urlsplit(val)[0] 398 if scheme == "": 399 # bug; there should be no relative URIs here 400 self.options.add_warning(err_non_legal_CURIE_ref % val, UnresolvablePrefix, node=self.node.nodeName) 401 return None 402 else: 403 if scheme not in uri_schemes: 404 self.options.add_warning(err_URI_scheme % val.strip(), node=self.node.nodeName) 405 return URIRef(val) 406 else: 407 # rdfa 1.0 case 408 self.options.add_warning(err_undefined_CURIE % val.strip(), UnresolvablePrefix, node=self.node.nodeName) 409 return None 410 # end _TERMorCURIEorAbsURI 411 412 # ----------------------------------------------------------------------------------------------- 413 414 def getURI(self, attr): 415 """Get the URI(s) for the attribute. The name of the attribute determines whether the value should be 416 a pure URI, a CURIE, etc, and whether the return is a single element of a list of those. This is done 417 using the L{ExecutionContext._resource_type} table. 
418 @param attr: attribute name 419 @type attr: string 420 @return: an RDFLib URIRef instance (or None) or a list of those 421 """ 422 if self.node.hasAttribute(attr): 423 val = self.node.getAttribute(attr) 424 else: 425 if attr in ExecutionContext._list: 426 return [] 427 else: 428 return None 429 430 # This may raise an exception if the attr has no key. This, actually, 431 # should not happen if the code is correct, but it does not harm having it here... 432 try: 433 func = ExecutionContext._resource_type[attr] 434 except: 435 # Actually, this should not happen... 436 func = ExecutionContext._URI 437 438 if attr in ExecutionContext._list: 439 # Allows for a list 440 resources = [ func(self, v.strip()) for v in val.strip().split() if v != None ] 441 retval = [ r for r in resources if r != None ] 442 else: 443 retval = func(self, val.strip()) 444 return retval 445 # end getURI 446 447 def getResource(self, *args): 448 """Get single resources from several different attributes. The first one that returns a valid URI wins. 449 @param args: variable list of attribute names, or a single attribute being a list itself. 450 @return: an RDFLib URIRef instance (or None): 451 """ 452 if len(args) == 0: 453 return None 454 if isinstance(args[0], tuple) or isinstance(args[0], list): 455 rargs = args[0] 456 else: 457 rargs = args 458 459 for resource in rargs: 460 uri = self.getURI(resource) 461 if uri != None : return uri 462 return None 463 464 # ----------------------------------------------------------------------------------------------- 465 def reset_list_mapping(self, origin=None): 466 """ 467 Reset, ie, create a new empty dictionary for the list mapping. 468 """ 469 self.list_mapping = ListStructure() 470 if origin: self.set_list_origin(origin) 471 self.new_list = True 472 473 def list_empty(self): 474 """ 475 Checks whether the list is empty. 
476 @return: Boolean 477 """ 478 return len(self.list_mapping.mapping) == 0 479 480 def get_list_props(self): 481 """ 482 Return the list of property values in the list structure 483 @return: list of URIRef 484 """ 485 return list(self.list_mapping.mapping.keys()) 486 487 def get_list_value(self,prop): 488 """ 489 Return the list of values in the list structure for a specific property 490 @return: list of RDF nodes 491 """ 492 return self.list_mapping.mapping[prop] 493 494 def set_list_origin(self, origin): 495 """ 496 Set the origin of the list, ie, the subject to attach the final list(s) to 497 @param origin: URIRef 498 """ 499 self.list_mapping.origin = origin 500 501 def get_list_origin(self): 502 """ 503 Return the origin of the list, ie, the subject to attach the final list(s) to 504 @return: URIRef 505 """ 506 return self.list_mapping.origin 507 508 def add_to_list_mapping(self, prop, resource): 509 """Add a new property-resource on the list mapping structure. The latter is a dictionary of arrays; 510 if the array does not exist yet, it will be created on the fly. 511 512 @param prop: the property URI, used as a key in the dictionary 513 @param resource: the resource to be added to the relevant array in the dictionary. Can be None; this is a dummy 514 placeholder for C{<span rel="property" inlist>...</span>} constructions that may be filled in by children or siblings; if not 515 an empty list has to be generated. 516 """ 517 if prop in self.list_mapping.mapping: 518 if resource != None: 519 # indeed, if it is None, than it should not override anything 520 if self.list_mapping.mapping[prop] == None: 521 # replacing a dummy with real content 522 self.list_mapping.mapping[prop] = [ resource ] 523 else : 524 self.list_mapping.mapping[prop].append(resource) 525 else: 526 if resource != None: 527 self.list_mapping.mapping[prop] = [ resource ] 528 else: 529 self.list_mapping.mapping[prop] = None
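The C{join} helper inside C{_URI} in the listing above works around C{urljoin} dropping a trailing '#' or '?'. A minimal standalone sketch of that idea follows. The name C{join_keep_trailing} and the example URIs are illustrative and not part of pyRdfa, and the sketch returns a plain string rather than an RDFLib URIRef.

```python
from urllib.parse import urljoin

def join_keep_trailing(base, ref):
    """Join ref to base, re-appending a trailing '#' or '?' if urljoin dropped it.

    Hypothetical helper for illustration only; pyRdfa's own version also
    wraps the result in a URIRef and checks the URI scheme.
    """
    joined = urljoin(base, ref)
    # Only patch the result when the reference ended in '#' or '?'
    # and the joined value no longer does.
    if ref and ref[-1] in "#?" and not joined.endswith(ref[-1]):
        return joined + ref[-1]
    return joined

vocab = join_keep_trailing("http://example.org/doc", "terms#")
query = join_keep_trailing("http://example.org/doc", "search?")
plain = join_keep_trailing("http://example.org/doc", "other")
```

The trailing '#' matters for Semantic Web vocabularies, where terms are typically formed by appending a local name to a namespace URI that ends in '#'.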
State at a specific node, including the current set of namespaces in the RDFLib sense, current language, the base, vocabularies, etc. The class is also used to interpret URI-s and CURIE-s to produce URI references for RDFLib.
@ivar options: reference to the overall options
@type options: L{Options}
@ivar base: the 'base' URI
@ivar parsedBase: the parsed version of base, as produced by urllib.parse.urlsplit
@ivar defaultNS: default namespace (if defined via @xmlns) to be used for XML Literals
@ivar lang: language tag (possibly None)
@ivar term_or_curie: vocabulary management class instance
@type term_or_curie: L{termorcurie.TermOrCurie}
@ivar list_mapping: dictionary of arrays, containing a list of URIs keyed by property, used for lists
@ivar node: the node to which this state belongs
@type node: DOM node instance
@ivar rdfa_version: RDFa version of the content
@type rdfa_version: String
@ivar supress_lang: in some cases, the effect of the lang attribute should be suppressed for the given node, although it should still be inherited by the nodes below (example: the @value attribute of the data element in HTML5)
@type supress_lang: Boolean
@cvar _list: list of attributes that allow for lists of values and should be treated as such
@cvar _resource_type: dictionary; mapping table from attribute name to the exact method to retrieve the URI(s). It is initialized at first instantiation.
    def __init__(self, node, graph, inherited_state=None, base="", options=None, rdfa_version=None):
        """
        @param node: the current DOM Node
        @param graph: the RDFLib Graph
        @keyword inherited_state: the state as inherited
        from upper layers. This inherited_state is mixed with the state information
        retrieved from the current node.
        @type inherited_state: L{state.ExecutionContext}
        @keyword base: string denoting the base URI for the specific node. This overrides the possible
        base inherited from the upper layers. The
        current XHTML+RDFa syntax does not allow the usage of C{@xml:base}, but SVG1.2 does, so this is
        necessary for SVG (and other possible XML dialects that accept C{@xml:base})
        @keyword options: invocation options, and references to warning graphs
        @type options: L{Options<pyRdfa.options>}
        """
        def remove_frag_id(uri):
            """
            The fragment ID for self.base must be removed
            """
            try:
                # To be on the safe side:-)
                t = urlparse(uri)
                return urlunparse((t[0], t[1], t[2], t[3], t[4], ""))
            except Exception:
                return uri

        # This is, conceptually, an additional class initialization, but it must be done at run time, otherwise import errors show up
        if len(ExecutionContext._resource_type) == 0:
            ExecutionContext._resource_type = {
                "href"     : ExecutionContext._URI,
                "src"      : ExecutionContext._URI,
                "vocab"    : ExecutionContext._URI,

                "about"    : ExecutionContext._CURIEorURI,
                "resource" : ExecutionContext._CURIEorURI,

                "rel"      : ExecutionContext._TERMorCURIEorAbsURI,
                "rev"      : ExecutionContext._TERMorCURIEorAbsURI,
                "datatype" : ExecutionContext._TERMorCURIEorAbsURI,
                "typeof"   : ExecutionContext._TERMorCURIEorAbsURI,
                "property" : ExecutionContext._TERMorCURIEorAbsURI,
                "role"     : ExecutionContext._TERMorCURIEorAbsURI,
            }
        #-----------------------------------------------------------------
        self.node = node

        #-----------------------------------------------------------------
        # Settling the base. In a generic XML, xml:base should be accepted at all levels (though this is not the
        # case in, say, XHTML...)
        # At the moment, it is invoked with a 'None' at the top level of parsing, that is
        # when the <base> element is looked for (for the HTML cases, that is)
        if inherited_state:
            self.rdfa_version = inherited_state.rdfa_version
            self.base = inherited_state.base
            self.options = inherited_state.options

            self.list_mapping = inherited_state.list_mapping
            self.new_list = False

            # for generic XML versions the xml:base attribute should be handled
            if self.options.host_language in accept_xml_base and node.hasAttribute("xml:base"):
                self.base = remove_frag_id(node.getAttribute("xml:base"))
        else:
            # this is the branch called from the very top
            self.list_mapping = ListStructure()
            self.new_list = True

            if rdfa_version is not None:
                self.rdfa_version = rdfa_version
            else:
                from . import rdfa_current_version
                self.rdfa_version = rdfa_current_version

            # This value can be overwritten by a @version attribute
            if node.hasAttribute("version"):
                top_version = node.getAttribute("version")
                if top_version.find("RDFa 1.0") != -1 or top_version.find("RDFa1.0") != -1:
                    self.rdfa_version = "1.0"
                elif top_version.find("RDFa 1.1") != -1 or top_version.find("RDFa1.1") != -1:
                    self.rdfa_version = "1.1"

            # this is just to play safe. I believe this should actually not happen...
            if options is None:
                from . import Options
                self.options = Options()
            else:
                self.options = options

            self.base = ""
            # handle the base element case for HTML
            if self.options.host_language in [ HostLanguage.xhtml, HostLanguage.html5, HostLanguage.xhtml5 ]:
                for bases in node.getElementsByTagName("base"):
                    if bases.hasAttribute("href"):
                        self.base = remove_frag_id(bases.getAttribute("href"))
                        continue
            elif self.options.host_language in accept_xml_base and node.hasAttribute("xml:base"):
                self.base = remove_frag_id(node.getAttribute("xml:base"))

            # If no local setting for base occurs, the input argument has it
            if self.base == "":
                self.base = base

            # Perform an extra beautification in RDFLib
            if self.options.host_language in beautifying_prefixes:
                values = beautifying_prefixes[self.options.host_language]
                for key in values:
                    graph.bind(key, values[key])

            input_info = "Input Host Language:%s, RDFa version:%s, base:%s" % (self.options.host_language, self.rdfa_version, self.base)
            self.options.add_info(input_info)

        #-----------------------------------------------------------------
        # this will be used repeatedly, better store it once and for all...
        self.parsedBase = urlsplit(self.base)

        #-----------------------------------------------------------------
        # generate and store the local CURIE handling class instance
        self.term_or_curie = TermOrCurie(self, graph, inherited_state)

        #-----------------------------------------------------------------
        # Settling the language tags
        # @xml:lang has priority over @lang (see the branch below)
        # it is a bit messy: the three fundamental modes (xhtml, html, or xml) are all slightly different:-(
        # first get the inherited state's language, if any
        if inherited_state:
            self.lang = inherited_state.lang
        else:
            self.lang = None

        self.supress_lang = False

        if self.options.host_language in [ HostLanguage.xhtml, HostLanguage.xhtml5, HostLanguage.html5 ]:
            # we may have lang and xml:lang
            if node.hasAttribute("lang"):
                lang = node.getAttribute("lang").lower()
            else:
                lang = None
            if node.hasAttribute("xml:lang"):
                xmllang = node.getAttribute("xml:lang").lower()
            else:
                xmllang = None
            # First of all, set the value, if any
            if xmllang is not None:
                # this has priority
                if len(xmllang) != 0:
                    self.lang = xmllang
                else:
                    self.lang = None
            elif lang is not None:
                if len(lang) != 0:
                    self.lang = lang
                else:
                    self.lang = None
            # Ideally, a warning should be generated if lang and xmllang are both present with different values. But
            # the HTML5 Parser does its magic by overriding a lang value if xmllang is present, so the potential
            # error situations are simply swallowed...

        elif self.options.host_language in accept_xml_lang and node.hasAttribute("xml:lang"):
            self.lang = node.getAttribute("xml:lang").lower()
            if len(self.lang) == 0:
                self.lang = None

        #-----------------------------------------------------------------
        # Set the default namespace. Used when generating XML Literals
        if node.hasAttribute("xmlns"):
            self.defaultNS = node.getAttribute("xmlns")
        elif inherited_state and inherited_state.defaultNS is not None:
            self.defaultNS = inherited_state.defaultNS
        else:
            self.defaultNS = None
@param node: the current DOM Node
@param graph: the RDFLib Graph
@keyword inherited_state: the state as inherited from upper layers. This inherited_state is mixed with the state information retrieved from the current node.
@type inherited_state: L{state.ExecutionContext}
@keyword base: string denoting the base URI for the specific node. This overrides the possible base inherited from the upper layers. The current XHTML+RDFa syntax does not allow the usage of C{@xml:base}, but SVG1.2 does, so this is necessary for SVG (and other possible XML dialects that accept C{@xml:base})
@keyword options: invocation options, and references to warning graphs
@type options: L{Options<pyRdfa.options>}
@keyword rdfa_version: RDFa version to use at the top level; if None, the package's current default version is used. A C{@version} attribute on the top-level node can override it.
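The base handling described above can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the class's actual code path: C{strip_fragment} mirrors the nested C{remove_frag_id} helper, while C{pick_base} and the example URIs are hypothetical names introduced here.

```python
from urllib.parse import urlparse, urlunparse

def strip_fragment(uri):
    """Return uri without its fragment identifier."""
    parts = urlparse(uri)
    # Rebuild the URI from its six components, blanking out the fragment.
    return urlunparse((parts.scheme, parts.netloc, parts.path,
                       parts.params, parts.query, ""))

def pick_base(local_base, argument_base):
    """Prefer a base declared on the node; otherwise fall back to the argument.

    Hypothetical helper: local_base stands for a <base href> or xml:base
    value found on the node ("" if none), argument_base for the 'base'
    keyword passed to the constructor.
    """
    return strip_fragment(local_base) if local_base else argument_base

declared = pick_base("http://a.example/b#c", "http://example.org/x")
fallback = pick_base("", "http://example.org/x")
```

The standard library also offers C{urllib.parse.urldefrag} for the same purpose; the sketch uses C{urlparse}/C{urlunparse} only because that is what the constructor does.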
    def getURI(self, attr):
        """Get the URI(s) for the attribute. The name of the attribute determines whether the value should be
        a pure URI, a CURIE, etc, and whether the return is a single element or a list of those. This is done
        using the L{ExecutionContext._resource_type} table.
        @param attr: attribute name
        @type attr: string
        @return: an RDFLib URIRef instance (or None) or a list of those
        """
        if self.node.hasAttribute(attr):
            val = self.node.getAttribute(attr)
        else:
            if attr in ExecutionContext._list:
                return []
            else:
                return None

        # This may raise an exception if the attr has no key. This, actually,
        # should not happen if the code is correct, but it does no harm having it here...
        try:
            func = ExecutionContext._resource_type[attr]
        except KeyError:
            # Actually, this should not happen...
            func = ExecutionContext._URI

        if attr in ExecutionContext._list:
            # Allows for a list
            resources = [ func(self, v.strip()) for v in val.strip().split() if v is not None ]
            retval = [ r for r in resources if r is not None ]
        else:
            retval = func(self, val.strip())
        return retval
Get the URI(s) for the attribute. The name of the attribute determines whether the value should be a pure URI, a CURIE, etc., and whether the return value is a single element or a list of those. This is done using the L{ExecutionContext._resource_type} table.
@param attr: attribute name
@type attr: string
@return: an RDFLib URIRef instance (or None) or a list of those
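The table-driven lookup can be illustrated with a small self-contained sketch. Everything in it is an assumption made for the example: the resolver functions, the C{BASE} and C{TERMS} values, and the contents of C{LIST_ATTRS} are toy stand-ins, and plain strings are returned instead of RDFLib URIRef instances.

```python
from urllib.parse import urljoin

BASE = "http://example.org/"  # assumed base URI for the example
TERMS = {                     # toy term table, not pyRdfa's vocabulary handling
    "next": "http://www.w3.org/1999/xhtml/vocab#next",
    "prev": "http://www.w3.org/1999/xhtml/vocab#prev",
}

def resolve_uri(value):
    """Resolve a (possibly relative) URI against BASE."""
    return urljoin(BASE, value)

def resolve_term(value):
    """Look up a term; unknown terms yield None."""
    return TERMS.get(value)

# Attribute name -> resolver, in the spirit of _resource_type.
RESOLVERS = {"href": resolve_uri, "src": resolve_uri,
             "rel": resolve_term, "rev": resolve_term}
# Attributes whose value is a whitespace-separated list (assumed subset).
LIST_ATTRS = {"rel", "rev"}

def get_uri(attrs, name):
    """Return the resolved value(s) of attribute 'name' from the dict 'attrs'."""
    if name not in attrs:
        return [] if name in LIST_ATTRS else None
    resolver = RESOLVERS.get(name, resolve_uri)
    raw = attrs[name].strip()
    if name in LIST_ATTRS:
        resolved = [resolver(token) for token in raw.split()]
        return [r for r in resolved if r is not None]  # drop unresolvable items
    return resolver(raw)

links = get_uri({"rel": "next  unknown prev"}, "rel")
target = get_uri({"href": "a.html"}, "href")
```

Note the asymmetry in the missing-attribute case: list-valued attributes yield an empty list, single-valued ones yield None, so callers can iterate over list results without a None check.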
    def getResource(self, *args):
        """Get single resources from several different attributes. The first one that returns a valid URI wins.
        @param args: variable list of attribute names, or a single attribute being a list itself.
        @return: an RDFLib URIRef instance (or None)
        """
        if len(args) == 0:
            return None
        if isinstance(args[0], (tuple, list)):
            rargs = args[0]
        else:
            rargs = args

        for resource in rargs:
            uri = self.getURI(resource)
            if uri is not None:
                return uri
        return None
Get a single resource from several different attributes. The first attribute that yields a valid URI wins.
@param args: variable list of attribute names, or a single argument that is itself a list (or tuple) of attribute names.
@return: an RDFLib URIRef instance (or None)
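The "first match wins" behaviour, including the two accepted calling conventions, can be sketched without any RDFa machinery. The function C{first_resource} and the C{lookup} callable are hypothetical names for this illustration; C{lookup} plays the role of C{getURI}.

```python
def first_resource(lookup, *names):
    """Return the first non-None result of lookup(name), or None.

    'names' may be given either as separate arguments or as a single
    list/tuple, matching the two calling styles described above.
    """
    if not names:
        return None
    candidates = names[0] if isinstance(names[0], (tuple, list)) else names
    for name in candidates:
        value = lookup(name)
        if value is not None:
            return value
    return None

# A plain dict stands in for the node's resolved attributes.
resolved = {"resource": "http://example.org/r", "href": "http://example.org/h"}
by_args = first_resource(resolved.get, "about", "resource", "href")
by_list = first_resource(resolved.get, ["src", "href"])
nothing = first_resource(resolved.get, "about", "src")
```

The order of the attribute names therefore acts as a priority list: earlier names shadow later ones whenever they resolve to something.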
    def reset_list_mapping(self, origin=None):
        """
        Reset, ie, create a new empty dictionary for the list mapping.
        """
        self.list_mapping = ListStructure()
        if origin:
            self.set_list_origin(origin)
        self.new_list = True
Reset the list mapping, i.e., replace it with a new, empty L{ListStructure}. If C{origin} is given, it is set as the origin of the new list.
    def list_empty(self):
        """
        Checks whether the list is empty.
        @return: Boolean
        """
        return len(self.list_mapping.mapping) == 0
Checks whether the list is empty.
@return: Boolean
    def get_list_props(self):
        """
        Return the list of property values in the list structure
        @return: list of URIRef
        """
        return list(self.list_mapping.mapping.keys())
Return the list of properties (the keys of the mapping) in the list structure.
@return: list of URIRef
    def get_list_value(self, prop):
        """
        Return the list of values in the list structure for a specific property
        @return: list of RDF nodes
        """
        return self.list_mapping.mapping[prop]
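Return the list of values in the list structure for a specific property.
@param prop: the property URI, used as a key in the mapping
@return: list of RDF nodes, or None if only a placeholder has been registered for the property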
    def set_list_origin(self, origin):
        """
        Set the origin of the list, ie, the subject to attach the final list(s) to
        @param origin: URIRef
        """
        self.list_mapping.origin = origin
Set the origin of the list, i.e., the subject to attach the final list(s) to.
@param origin: URIRef
    def get_list_origin(self):
        """
        Return the origin of the list, ie, the subject to attach the final list(s) to
        @return: URIRef
        """
        return self.list_mapping.origin
Return the origin of the list, i.e., the subject to attach the final list(s) to.
@return: URIRef
    def add_to_list_mapping(self, prop, resource):
        """Add a new property-resource on the list mapping structure. The latter is a dictionary of arrays;
        if the array does not exist yet, it will be created on the fly.

        @param prop: the property URI, used as a key in the dictionary
        @param resource: the resource to be added to the relevant array in the dictionary. Can be None; this is a dummy
        placeholder for C{<span rel="property" inlist>...</span>} constructions that may be filled in by children or siblings; if not,
        an empty list has to be generated.
        """
        if prop in self.list_mapping.mapping:
            if resource is not None:
                # indeed, if it is None, then it should not override anything
                if self.list_mapping.mapping[prop] is None:
                    # replacing a dummy with real content
                    self.list_mapping.mapping[prop] = [ resource ]
                else:
                    self.list_mapping.mapping[prop].append(resource)
        else:
            if resource is not None:
                self.list_mapping.mapping[prop] = [ resource ]
            else:
                self.list_mapping.mapping[prop] = None
Add a new property-resource on the list mapping structure. The latter is a dictionary of arrays; if the array does not exist yet, it will be created on the fly.
@param prop: the property URI, used as a key in the dictionary
@param resource: the resource to be added to the relevant array in the dictionary. Can be None; in that case it is a dummy placeholder for C{...} constructions that may be filled in by children or siblings; if they are not, an empty list has to be generated.
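To see the placeholder behaviour in isolation, the following sketch replays the same rules on a plain dictionary. The function name C{add_to_mapping} and the property and value strings are made up for the example; the real method works on C{self.list_mapping.mapping} and stores RDF nodes.

```python
def add_to_mapping(mapping, prop, resource):
    """Add resource under prop; None registers a placeholder only."""
    if prop in mapping:
        if resource is not None:
            if mapping[prop] is None:
                # A placeholder is replaced by the first real value.
                mapping[prop] = [resource]
            else:
                mapping[prop].append(resource)
        # resource is None: never override what is already there.
    else:
        mapping[prop] = [resource] if resource is not None else None

mapping = {}
add_to_mapping(mapping, "ex:author", None)     # placeholder only
placeholder_state = dict(mapping)
add_to_mapping(mapping, "ex:author", "Alice")  # placeholder becomes a list
add_to_mapping(mapping, "ex:author", None)     # no effect
add_to_mapping(mapping, "ex:author", "Bob")    # appended
add_to_mapping(mapping, "ex:editor", None)     # stays a placeholder
```

A property whose entry is still None at the end is the case the docstring refers to: nothing filled the placeholder, so an empty list has to be generated for it.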