pyRdfa.termorcurie
Management of vocabularies, terms, and their mapping to URI-s. The main class of this module (L{TermOrCurie}) is, conceptually, part of the overall state of processing at a node (L{state.ExecutionContext}), but putting it into a separate module makes it easier to maintain.
@summary: Management of vocabularies, terms, and their mapping to URI-s.
@requires: U{RDFLib package<http://rdflib.net>}
@organization: U{World Wide Web Consortium<http://www.w3.org>}
@author: U{Ivan Herman<http://www.w3.org/People/Ivan/>}
@license: This software is available for use under the
U{W3C® SOFTWARE NOTICE AND LICENSE<http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231>}
@var XHTML_PREFIX: prefix for the XHTML vocabulary URI (set to 'xhv')
@var XHTML_URI: URI prefix of the XHTML vocabulary
@var ncname: Regular expression object for NCNAME
@var termname: Regular expression object for a term
@var xml_application_media_type: Regular expression object for a general XML application media type
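The three regular expression objects listed above can be tried out in isolation. The following is a minimal sketch: the patterns are copied from the module source, while the helper names `is_ncname`, `is_term`, and `is_xml_application` are hypothetical and exist only for this illustration.

```python
import re

# Patterns copied from the module source.
ncname = re.compile("^[A-Za-z][A-Za-z0-9._-]*$")
termname = re.compile("^[A-Za-z]([A-Za-z0-9._-]|/)*$")
xml_application_media_type = re.compile(r"application/[a-zA-Z0-9]+\+xml")


# Hypothetical helpers, not part of pyRdfa.
def is_ncname(s):
    """True if s is acceptable as a prefix name (simplified NCNAME)."""
    return ncname.match(s) is not None


def is_term(s):
    """True if s is acceptable as a term; unlike an NCNAME it may contain '/'."""
    return termname.match(s) is not None


def is_xml_application(media_type):
    """True if media_type starts with an 'application/...+xml' type."""
    return xml_application_media_type.match(media_type) is not None


if __name__ == "__main__":
    for candidate in ("foaf", "_", "1abc", "a/b"):
        print(candidate, is_ncname(candidate), is_term(candidate))
    for media_type in ("application/rdf+xml", "text/html"):
        print(media_type, is_xml_application(media_type))
```

Note that `ncname` is a simplified, ASCII-only approximation of the XML NCName production: it rejects a leading underscore, which is why the `_` prefix used for blank nodes has to be handled separately in the code.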
# -*- coding: utf-8 -*-
"""
Management of vocabularies, terms, and their mapping to URI-s. The main class of this module (L{TermOrCurie}) is,
conceptually, part of the overall state of processing at a node (L{state.ExecutionContext}), but putting it into a separate
module makes it easier to maintain.

@summary: Management of vocabularies, terms, and their mapping to URI-s.
@requires: U{RDFLib package<http://rdflib.net>}
@organization: U{World Wide Web Consortium<http://www.w3.org>}
@author: U{Ivan Herman<http://www.w3.org/People/Ivan/>}
@license: This software is available for use under the
U{W3C® SOFTWARE NOTICE AND LICENSE<http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231>}

@var XHTML_PREFIX: prefix for the XHTML vocabulary URI (set to 'xhv')
@var XHTML_URI: URI prefix of the XHTML vocabulary
@var ncname: Regular expression object for NCNAME
@var termname: Regular expression object for a term
@var xml_application_media_type: Regular expression object for a general XML application media type
"""

"""
$Id: termorcurie.py,v 1.12 2013-10-16 11:48:54 ivan Exp $
$Date: 2013-10-16 11:48:54 $
"""

import re

from urllib.parse import urlsplit


from rdflib import URIRef
from rdflib import BNode
from rdflib import Namespace

from .utils import quote_URI
from .host import predefined_1_0_rel, warn_xmlns_usage
from . import IncorrectPrefixDefinition, RDFA_VOCAB, UnresolvableReference, PrefixRedefinitionWarning

from . import err_redefining_URI_as_prefix
from . import err_xmlns_deprecated
from . import err_bnode_local_prefix
from . import err_col_local_prefix
from . import err_missing_URI_prefix
from . import err_invalid_prefix
from . import err_no_default_prefix
from . import err_prefix_and_xmlns
from . import err_non_ncname_prefix
from . import err_absolute_reference
from . import err_query_reference
from . import err_fragment_reference
from . import err_prefix_redefinition


# Regular expression object for NCNAME
ncname = re.compile("^[A-Za-z][A-Za-z0-9._-]*$")

# Regular expression object for term name
termname = re.compile("^[A-Za-z]([A-Za-z0-9._-]|/)*$")

# Regular expression object for a general XML application media type
xml_application_media_type = re.compile(r"application/[a-zA-Z0-9]+\+xml")

XHTML_PREFIX = "xhv"
XHTML_URI = "http://www.w3.org/1999/xhtml/vocab#"

#### Managing blank nodes for CURIE-s: mapping from local names to blank nodes.
_bnodes = {}
_empty_bnode = BNode()

####

class InitialContext:
    """
    Get the initial context values. In most cases this class has an empty content, except for the
    top level (in case of RDFa 1.1). Each L{TermOrCurie} class has one instance of this class. It provides initial
    mappings for terms, namespace prefixes, etc, that the top level L{TermOrCurie} instance uses for its own initialization.

    @ivar terms: collection of all term mappings
    @type terms: dictionary
    @ivar ns: namespace mapping
    @type ns: dictionary
    @ivar vocabulary: default vocabulary
    @type vocabulary: string
    """

    def __init__(self, state, top_level):
        """
        @param state: the state behind this term mapping
        @type state: L{state.ExecutionContext}
        @param top_level : whether this is the top node of the DOM tree (the only place where initial contexts are handled)
        @type top_level : boolean
        """
        self.state = state

        # This is to store the local terms
        self.terms = {}
        # This is to store the local Namespaces (a.k.a. prefixes)
        self.ns = {}
        # Default vocabulary
        self.vocabulary = None

        if state.rdfa_version < "1.1" or top_level == False:
            return

        from .initialcontext import initial_context as context_data
        from .host import initial_contexts as context_ids
        from .host import default_vocabulary

        for i in context_ids[state.options.host_language]:
            # This gives the id of an initial context, valid for this media type:
            data = context_data[i]

            # Merge the context data with the overall definition
            if state.options.host_language in default_vocabulary:
                self.vocabulary = default_vocabulary[state.options.host_language]
            elif data.vocabulary != "":
                self.vocabulary = data.vocabulary

            for key in data.terms:
                self.terms[key] = URIRef(data.terms[key])
            for key in data.ns:
                self.ns[key] = (Namespace(data.ns[key]), False)


##################################################################################################################

class TermOrCurie:
    """
    Wrapper around vocabulary management, ie, mapping a term to a URI, as well as a CURIE to a URI. Each instance of this class belongs to a
    "state", instance of L{state.ExecutionContext}. Context definitions are managed at initialization time.

    (In fact, this class is, conceptually, part of the overall state at a node, and has been separated here for
    easier maintenance.)

    The class takes care of the stack-like behavior of vocabulary items, ie, inheriting everything that is possible
    from the "parent". At initialization time, this works through the prefix definitions (i.e., C{@prefix} or C{@xmlns:} attributes)
    and/or C{@vocab} attributes.

    @ivar state: State to which this instance belongs
    @type state: L{state.ExecutionContext}
    @ivar graph: The RDF Graph under generation
    @type graph: rdflib.Graph
    @ivar terms: mapping from terms to URI-s
    @type terms: dictionary
    @ivar ns: namespace declarations, ie, mapping from prefixes to URIs
    @type ns: dictionary
    @ivar default_curie_uri: URI for a default CURIE
    """
    def __init__(self, state, graph, inherited_state):
        """Initialize the vocab bound to a specific state.
        @param state: the state to which this vocab instance belongs
        @type state: L{state.ExecutionContext}
        @param graph: the RDF graph being worked on
        @type graph: rdflib.Graph
        @param inherited_state: the state inherited by the current state. 'None' if this is the top level state.
        @type inherited_state: L{state.ExecutionContext}
        """
        def check_prefix(pr):
            from . import uri_schemes
            if pr in uri_schemes:
                # The prefix being defined is a registered URI scheme, better avoid it...
                state.options.add_warning(err_redefining_URI_as_prefix % pr, node=state.node.nodeName)

        self.state = state
        self.graph = graph

        # --------------------------------------------------------------------------------
        # This is set to non-void only on the top level and in the case of 1.1
        default_vocab = InitialContext(self.state, inherited_state == None)

        # Set the default CURIE URI
        if inherited_state == None:
            # This is the top level...
            self.default_curie_uri = Namespace(XHTML_URI)
            # self.graph.bind(XHTML_PREFIX, self.default_curie_uri)
        else:
            self.default_curie_uri = inherited_state.term_or_curie.default_curie_uri

        # --------------------------------------------------------------------------------
        # Set the default term URI
        # This is a 1.1 feature, ie, should be ignored if the version is < 1.1
        if state.rdfa_version >= "1.1":
            # that is the absolute default setup...
            if inherited_state == None:
                self.default_term_uri = None
            else:
                self.default_term_uri = inherited_state.term_or_curie.default_term_uri

            # see if the initial context has defined a default vocabulary:
            if default_vocab.vocabulary:
                self.default_term_uri = default_vocab.vocabulary

            # see if there is local vocab that would override previous settings
            # However, care should be taken with the vocab="" value that should not become a URI...
            # Indeed, this value is used to 'wipe out', ie, get back to the default vocabulary...
            if self.state.node.hasAttribute("vocab") and self.state.node.getAttribute("vocab") == "":
                self.default_term_uri = default_vocab.vocabulary
            else:
                def_term_uri = self.state.getURI("vocab")
                if def_term_uri and def_term_uri != "":
                    self.default_term_uri = def_term_uri
                    self.graph.add((URIRef(self.state.base), RDFA_VOCAB, URIRef(def_term_uri)))
        else:
            self.default_term_uri = None

        # --------------------------------------------------------------------------------
        # The simpler case: terms, adding those that have been defined by a possible initial context
        if inherited_state is None:
            # this is the vocabulary belonging to the top level of the tree!
            self.terms = {}
            if state.rdfa_version >= "1.1":
                # Simply get the terms defined by the default vocabularies. There is no need for merging
                for key in default_vocab.terms:
                    self.terms[key] = default_vocab.terms[key]
            else:
                # The terms are hardwired...
                for key in predefined_1_0_rel:
                    self.terms[key] = URIRef(XHTML_URI + key)
        else:
            # just refer to the inherited terms
            self.terms = inherited_state.term_or_curie.terms

        #-----------------------------------------------------------------
        # the locally defined namespaces
        ns_dict = {}
        # locally defined xmlns namespaces, necessary for correct XML Literal generation
        xmlns_dict = {}

        # Add the locally defined namespaces using the xmlns: syntax
        for i in range(0, state.node.attributes.length):
            attr = state.node.attributes.item(i)
            if attr.name.find('xmlns:') == 0:
                # yep, there is a namespace setting
                prefix = attr.localName
                if prefix != "":  # exclude the top level xmlns setting...
                    if state.rdfa_version >= "1.1" and state.options.host_language in warn_xmlns_usage:
                        state.options.add_warning(err_xmlns_deprecated % prefix, IncorrectPrefixDefinition, node=state.node.nodeName)
                    if prefix == "_":
                        state.options.add_warning(err_bnode_local_prefix, IncorrectPrefixDefinition, node=state.node.nodeName)
                    elif prefix.find(':') != -1:
                        state.options.add_warning(err_col_local_prefix % prefix, IncorrectPrefixDefinition, node=state.node.nodeName)
                    else:
                        # quote the URI, ie, convert special characters into %.. This is
                        # true, for example, for spaces
                        uri = quote_URI(attr.value, state.options)
                        # create a new RDFLib Namespace entry
                        ns = Namespace(uri)
                        # Add an entry to the dictionary if not already there (priority is left to right!)
                        if state.rdfa_version >= "1.1":
                            pr = prefix.lower()
                        else:
                            pr = prefix
                        ns_dict[pr] = ns
                        xmlns_dict[pr] = ns
                        self.graph.bind(pr, ns)
                        check_prefix(pr)

        # Add the locally defined namespaces using the @prefix syntax
        # this may override the definition @xmlns
        if state.rdfa_version >= "1.1" and state.node.hasAttribute("prefix"):
            pr = state.node.getAttribute("prefix")
            if pr != None:
                # separator character is whitespace
                pr_list = pr.strip().split()
                # range(0, len(pr_list), 2)
                for i in range(len(pr_list) - 2, -1, -2):
                    prefix = pr_list[i]
                    # see if there is a URI at all
                    if i == len(pr_list) - 1:
                        state.options.add_warning(err_missing_URI_prefix % (prefix, pr), node=state.node.nodeName)
                        break
                    else:
                        value = pr_list[i+1]

                    # see if the value of prefix is o.k., ie, there is a ':' at the end
                    if prefix[-1] != ':':
                        state.options.add_warning(err_invalid_prefix % (prefix, pr), IncorrectPrefixDefinition, node=state.node.nodeName)
                        continue
                    elif prefix == ":":
                        state.options.add_warning(err_no_default_prefix % pr, IncorrectPrefixDefinition, node=state.node.nodeName)
                        continue
                    else:
                        prefix = prefix[:-1]
                        uri = Namespace(quote_URI(value, state.options))
                        if prefix == "":
                            # something to be done here
                            self.default_curie_uri = uri
                        elif prefix == "_":
                            state.options.add_warning(err_bnode_local_prefix, IncorrectPrefixDefinition, node=state.node.nodeName)
                        else:
                            # last check: is the prefix an NCNAME?
                            if ncname.match(prefix):
                                real_prefix = prefix.lower()
                                ns_dict[real_prefix] = uri
                                self.graph.bind(real_prefix, uri)
                                # Additional warning: is this prefix overriding an existing xmlns statement with a different URI? If
                                # so, that may lead to discrepancies between an RDFa 1.0 and RDFa 1.1 run...
                                if (prefix in xmlns_dict and xmlns_dict[prefix] != uri) or (real_prefix in xmlns_dict and xmlns_dict[real_prefix] != uri):
                                    state.options.add_warning(err_prefix_and_xmlns % (real_prefix, real_prefix), node=state.node.nodeName)
                                check_prefix(real_prefix)

                            else:
                                state.options.add_warning(err_non_ncname_prefix % (prefix, pr), IncorrectPrefixDefinition, node=state.node.nodeName)

        # See if anything has been collected at all.
        # If not, the namespaces of the incoming state are
        # taken over by reference. Otherwise they are copied to
        # the local dictionary
        if inherited_state == None:
            self.default_prefixes = default_vocab.ns
            inherited_prefixes = {}
        else:
            self.default_prefixes = inherited_state.term_or_curie.default_prefixes
            inherited_prefixes = inherited_state.term_or_curie.ns

        if len(ns_dict) == 0:
            self.ns = inherited_prefixes
        else:
            self.ns = {}
            for key in inherited_prefixes:
                self.ns[key] = inherited_prefixes[key]
            for key in ns_dict:
                if (key in inherited_prefixes and ns_dict[key] != inherited_prefixes[key]) or (key in self.default_prefixes and ns_dict[key] != self.default_prefixes[key][0]):
                    state.options.add_warning(err_prefix_redefinition % key, PrefixRedefinitionWarning, node=state.node.nodeName)
                self.ns[key] = ns_dict[key]


        # the xmlns prefixes have to be stored separately, again for XML Literal generation
        self.xmlns = {}
        if len(xmlns_dict) == 0 and inherited_state:
            self.xmlns = inherited_state.term_or_curie.xmlns
        else:
            if inherited_state:
                for key in inherited_state.term_or_curie.xmlns:
                    self.xmlns[key] = inherited_state.term_or_curie.xmlns[key]
                for key in xmlns_dict:
                    self.xmlns[key] = xmlns_dict[key]
            else:
                self.xmlns = xmlns_dict
    # end __init__

    def _check_reference(self, val):
        """Checking the CURIE reference for correctness. It is probably not 100% foolproof, but may take care
        of some of the possible errors. See the URI RFC for the details.
        """
        def char_check(s, not_allowed = ['#','[',']']):
            for c in not_allowed:
                if s.find(c) != -1:
                    return False
            return True
        # Creating an artificial http URI to fool the urlparse module...
        _scheme, netloc, _url, query, fragment = urlsplit('http:' + val)
        if netloc != "" and self.state.rdfa_version >= "1.1":
            self.state.options.add_warning(err_absolute_reference % (netloc, val), UnresolvableReference, node=self.state.node.nodeName)
            return False
        elif not char_check(query):
            self.state.options.add_warning(err_query_reference % (query, val), UnresolvableReference, node=self.state.node.nodeName)
            return False
        elif not char_check(fragment):
            self.state.options.add_warning(err_fragment_reference % (fragment, val), UnresolvableReference, node=self.state.node.nodeName)
            return False
        else:
            return True

    def CURIE_to_URI(self, val):
        """CURIE to URI mapping.

        This method does I{not} take care of the last step of CURIE processing, ie, the fact that if
        the value is not a CURIE then it is used as a URI. This is done on the caller's side, because this has
        to be combined with base, for example. The method I{does} take care of BNode processing, though, ie,
        CURIE-s of the form "_:XXX".

        @param val: the full CURIE
        @type val: string
        @return: URIRef of a URI or None.
        """
        # Just to be on the safe side:
        if val == "":
            return None
        elif val == ":":
            if self.default_curie_uri:
                return URIRef(self.default_curie_uri)
            else:
                return None

        # See if this is indeed a valid CURIE, ie, it can be split by a colon
        curie_split = val.split(':', 1)
        if len(curie_split) == 1:
            # there is no ':' character in the string, ie, it is not a valid CURIE
            return None
        else:
            if self.state.rdfa_version >= "1.1":
                prefix = curie_split[0].lower()
            else:
                prefix = curie_split[0]
            reference = curie_split[1]

            #if len(reference) > 0:
            #    if self.state.rdfa_version >= "1.1" and (len(prefix) == 0 or prefix in self.ns) and reference.startswith('//'):
            #        # This has been defined as illegal in RDFa 1.1
            #        self.state.options.add_warning(err_absolute_reference % (reference, val), UnresolvableReference, node=self.state.node.nodeName)
            #        return None
            #    if reference[0] == ":":
            #        return None

            # first possibility: empty prefix
            if len(prefix) == 0:
                if self.default_curie_uri and self._check_reference(reference):
                    return self.default_curie_uri[reference]
                else:
                    return None
            else:
                # prefix is non-empty; can be a bnode
                if prefix == "_":
                    # yep, BNode processing. There is a difference whether the reference is empty or not...
                    if len(reference) == 0:
                        return _empty_bnode
                    else:
                        # see if this variable has been used before for a BNode
                        if reference in _bnodes:
                            return _bnodes[reference]
                        else:
                            # a new bnode...
                            retval = BNode()
                            _bnodes[reference] = retval
                            return retval
                # check if the prefix is a valid NCNAME
                elif ncname.match(prefix):
                    # see if there is a binding for this:
                    if prefix in self.ns and self._check_reference(reference):
                        # yep, a binding has been defined!
                        if len(reference) == 0:
                            return URIRef(str(self.ns[prefix]))
                        else:
                            return self.ns[prefix][reference]
                    elif prefix in self.default_prefixes and self._check_reference(reference):
                        # this has been defined through the default context
                        if len(reference) == 0:
                            return URIRef(str(self.default_prefixes[prefix][0]))
                        else:
                            (ns, used) = self.default_prefixes[prefix]
                            # lazy binding of prefixes (to avoid unnecessary prefix definitions in the serializations at the end...)
                            if not used:
                                self.graph.bind(prefix, ns)
                                self.default_prefixes[prefix] = (ns, True)
                            return ns[reference]
                    else:
                        # no definition for this thing...
                        return None
                else:
                    return None
    # end CURIE_to_URI

    def term_to_URI(self, term):
        """A term to URI mapping, where term is a simple string and the corresponding
        URI is defined via the @vocab (ie, default term uri) mechanism. Returns None if term is not defined
        @param term: string
        @return: an RDFLib URIRef instance (or None)
        """
        if len(term) == 0:
            return None

        if termname.match(term):
            # It is a valid term name

            # First of all, a @vocab nukes everything. That has to be done first...
            if self.default_term_uri != None:
                return URIRef(self.default_term_uri + term)

            # For default terms, the algorithm is (see 7.4.3 of the document): first make a case sensitive match;
            # if that fails then make a case insensitive one
            # 1. simple, case sensitive test:
            if term in self.terms:
                # yep, term is a valid key as is
                # lazy binding of the xhv prefix for terms...
                self.graph.bind(XHTML_PREFIX, XHTML_URI)
                return self.terms[term]

            # 2. case insensitive test
            for defined_term in self.terms:
                if term.lower() == defined_term.lower():
                    # lazy binding of the xhv prefix for terms...
                    self.graph.bind(XHTML_PREFIX, XHTML_URI)
                    return self.terms[defined_term]

        # If it got here, it is all wrong...
        return None
373 """ 374 # Just to be on the safe side: 375 if val == "": 376 return None 377 elif val == ":": 378 if self.default_curie_uri: 379 return URIRef(self.default_curie_uri) 380 else: 381 return None 382 383 # See if this is indeed a valid CURIE, ie, it can be split by a colon 384 curie_split = val.split(':',1) 385 if len(curie_split) == 1: 386 # there is no ':' character in the string, ie, it is not a valid CURIE 387 return None 388 else: 389 if self.state.rdfa_version >= "1.1": 390 prefix = curie_split[0].lower() 391 else: 392 prefix = curie_split[0] 393 reference = curie_split[1] 394 395 #if len(reference) > 0: 396 # if self.state.rdfa_version >= "1.1" and (len(prefix) == 0 or prefix in self.ns) and reference.startswith('//'): 397 # # This has been defined as illegal in RDFa 1.1 398 # self.state.options.add_warning(err_absolute_reference % (reference, val), UnresolvableReference, node=self.state.node.nodeName) 399 # return None 400 # if reference[0] == ":": 401 # return None 402 403 # first possibility: empty prefix 404 if len(prefix) == 0: 405 if self.default_curie_uri and self._check_reference(reference): 406 return self.default_curie_uri[reference] 407 else: 408 return None 409 else: 410 # prefix is non-empty; can be a bnode 411 if prefix == "_": 412 # yep, BNode processing. There is a difference whether the reference is empty or not... 413 if len(reference) == 0: 414 return _empty_bnode 415 else: 416 # see if this variable has been used before for a BNode 417 if reference in _bnodes: 418 return _bnodes[reference] 419 else: 420 # a new bnode... 421 retval = BNode() 422 _bnodes[reference] = retval 423 return retval 424 # check if the prefix is a valid NCNAME 425 elif ncname.match(prefix): 426 # see if there is a binding for this: 427 if prefix in self.ns and self._check_reference(reference): 428 # yep, a binding has been defined! 
429 if len(reference) == 0: 430 return URIRef(str(self.ns[prefix])) 431 else: 432 return self.ns[prefix][reference] 433 elif prefix in self.default_prefixes and self._check_reference(reference): 434 # this has been defined through the default context 435 if len(reference) == 0: 436 return URIRef(str(self.default_prefixes[prefix][0])) 437 else: 438 (ns,used) = self.default_prefixes[prefix] 439 # lazy binding of prefixes (to avoid unnecessary prefix definitions in the serializations at the end...) 440 if not used: 441 self.graph.bind(prefix,ns) 442 self.default_prefixes[prefix] = (ns,True) 443 return ns[reference] 444 else: 445 # no definition for this thing... 446 return None 447 else: 448 return None 449 # end CURIE_to_URI 450 451 def term_to_URI(self, term): 452 """A term to URI mapping, where term is a simple string and the corresponding 453 URI is defined via the @vocab (ie, default term uri) mechanism. Returns None if term is not defined 454 @param term: string 455 @return: an RDFLib URIRef instance (or None) 456 """ 457 if len(term) == 0 : return None 458 459 if termname.match(term): 460 # It is a valid NCNAME 461 462 # First of all, a @vocab nukes everything. That has to be done first... 463 if self.default_term_uri != None: 464 return URIRef(self.default_term_uri + term) 465 466 # For default terms, the algorithm is (see 7.4.3 of the document): first make a case sensitive match; 467 # if that fails than make a case insensive one 468 # 1. simple, case sensitive test: 469 if term in self.terms: 470 # yep, term is a valid key as is 471 # lazy binding of the xhv prefix for terms... 472 self.graph.bind(XHTML_PREFIX, XHTML_URI) 473 return self.terms[term] 474 475 # 2. case insensitive test 476 for defined_term in self.terms: 477 if term.lower() == defined_term.lower(): 478 # lazy binding of the xhv prefix for terms... 479 self.graph.bind(XHTML_PREFIX, XHTML_URI) 480 return self.terms[defined_term] 481 482 # If it got here, it is all wrong... 483 return None
Wrapper around vocabulary management, i.e., mapping a term to a URI as well as a CURIE to a URI. Each instance of this class belongs to a "state", an instance of L{state.ExecutionContext}. Context definitions are managed at initialization time.
(In fact, this class is, conceptually, part of the overall state at a node, and has been separated here for easier maintenance.)
The class takes care of the stack-like behavior of vocabulary items, i.e., inheriting everything that is possible from the "parent". At initialization time, this works through the prefix definitions (i.e., C{@prefix} or C{@xmlns:} attributes) and/or C{@vocab} attributes.
@ivar state: State to which this instance belongs
@type state: L{state.ExecutionContext}
@ivar graph: The RDF Graph under generation
@type graph: rdflib.Graph
@ivar terms: mapping from terms to URI-s
@type terms: dictionary
@ivar ns: namespace declarations, ie, mapping from prefixes to URIs
@type ns: dictionary
@ivar default_curie_uri: URI for a default CURIE
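The stack-like inheritance described above can be sketched in a few lines. This is a simplified illustration, not the module's code: plain dictionaries of strings stand in for the RDFLib namespaces, and the name inherit_prefixes is hypothetical. It follows the rule used in the initializer: with no local declarations the parent's dictionary is shared by reference, otherwise it is copied and the local entries override the inherited ones.

```python
def inherit_prefixes(inherited, local):
    """Return the prefix mapping that applies at a node.

    `inherited` is the parent's mapping and `local` holds the declarations
    found on the current node. Both names are hypothetical and exist only
    for this sketch.
    """
    if not local:
        # Nothing declared locally: share the parent's dictionary by reference.
        return inherited
    # Otherwise copy the parent's mapping and let local declarations override it,
    # so that the parent's view stays untouched.
    merged = dict(inherited)
    merged.update(local)
    return merged


parent = {"dc": "http://purl.org/dc/terms/"}
same = inherit_prefixes(parent, {})
child = inherit_prefixes(parent, {"foaf": "http://xmlns.com/foaf/0.1/"})
```

Sharing by reference keeps deep documents cheap when most nodes declare nothing, while the copy guarantees that a declaration on a child never leaks back into its parent or siblings.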
def __init__(self, state, graph, inherited_state):
Initialize the vocab bound to a specific state.
@param state: the state to which this vocab instance belongs
@type state: L{state.ExecutionContext}
@param graph: the RDF graph being worked on
@type graph: rdflib.Graph
@param inherited_state: the state inherited by the current state. 'None' if this is the top level state.
@type inherited_state: L{state.ExecutionContext}
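Part of the initialization is reading the whitespace-separated C{@prefix} attribute. The following is a minimal sketch of that step under stated assumptions: the function name parse_prefix_attribute and the ASCII-only NCNAME pattern are hypothetical simplifications, plain strings replace RDFLib namespaces, and invalid entries are silently skipped where the real code records warnings. Pairs are processed from right to left, as in the listing, so for a repeated prefix the left-most declaration is the one that remains.

```python
import re

# Simplified, ASCII-only stand-in for the module's NCName pattern (an assumption).
NCNAME = re.compile(r"^[A-Za-z_][A-Za-z0-9._-]*$")


def parse_prefix_attribute(value):
    """Turn 'dc: http://... foaf: http://...' into a {prefix: uri} dictionary."""
    tokens = value.split()
    # Pair each prefix token with the URI that follows it; a trailing prefix
    # without a URI has no partner and is dropped by zip().
    pairs = list(zip(tokens[0::2], tokens[1::2]))
    mapping = {}
    # Walk right to left so that earlier (left-most) declarations overwrite later ones.
    for prefix, uri in reversed(pairs):
        if not prefix.endswith(":") or prefix == ":":
            continue  # not of the form 'name:'
        name = prefix[:-1]
        if name == "_" or not NCNAME.match(name):
            continue  # '_' is reserved for blank nodes; other names must be NCNames
        mapping[name.lower()] = uri  # RDFa 1.1 prefixes are stored in lower case
    return mapping
```

For example, parse_prefix_attribute("dc: http://purl.org/dc/terms/ FOAF: http://xmlns.com/foaf/0.1/") yields entries for "dc" and "foaf".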
def CURIE_to_URI(self, val):
CURIE to URI mapping.
This method does I{not} take care of the last step of CURIE processing, i.e., the fact that if the value is not a CURIE then it is used as a URI. This is done on the caller's side, because this has to be combined with base, for example. The method I{does} take care of BNode processing, though, i.e., CURIE-s of the form "_:XXX".
@param val: the full CURIE
@type val: string
@return: URIRef of a URI or None.
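The resolution order can be illustrated with a small standalone sketch. It is not the method itself: curie_to_uri, its parameters and the "_:bN" labels are hypothetical, plain strings stand in for URIRef and BNode objects, RDFa 1.1 behavior (lower-cased prefixes) is assumed, and the reference checks and NCName validation of the real method are left out.

```python
import itertools

_counter = itertools.count()
_bnodes = {}  # blank node label -> generated identifier, shared across calls


def curie_to_uri(curie, prefixes, default_uri=None):
    """Resolve a CURIE against `prefixes`; return None if it is not a usable CURIE."""
    if ":" not in curie:
        # Covers the empty string too; the caller decides whether to treat
        # the value as a plain URI instead.
        return None
    prefix, reference = curie.split(":", 1)
    prefix = prefix.lower()
    if prefix == "":
        # ':reference' uses the default CURIE URI, if there is one.
        return default_uri + reference if default_uri else None
    if prefix == "_":
        # Blank node: the same label always maps to the same node.
        if reference not in _bnodes:
            _bnodes[reference] = "_:b%d" % next(_counter)
        return _bnodes[reference]
    if prefix in prefixes:
        return prefixes[prefix] + reference
    return None


ns = {"dc": "http://purl.org/dc/terms/"}
title = curie_to_uri("DC:title", ns)
```

Returning None instead of raising mirrors the documented contract: an unresolved value is not an error at this level, because the caller may still interpret it as a URI relative to the base.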
def term_to_URI(self, term):
A term to URI mapping, where term is a simple string and the corresponding URI is defined via the @vocab (i.e., default term URI) mechanism. Returns None if the term is not defined.
@param term: string
@return: an RDFLib URIRef instance (or None)
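The lookup order for terms can be sketched as follows. This is an illustration under assumptions, not the method itself: term_to_uri and its parameters are hypothetical names, plain strings replace URIRef objects, and the check of the term against the termname pattern as well as the lazy binding of the xhv prefix are omitted.

```python
def term_to_uri(term, terms, default_term_uri=None):
    """Map a term to a URI string, or return None if it is not defined."""
    if not term:
        return None
    # A default vocabulary (@vocab) takes precedence over everything else.
    if default_term_uri is not None:
        return default_term_uri + term
    # 1. case sensitive match against the predefined terms
    if term in terms:
        return terms[term]
    # 2. case insensitive match as a fallback
    lowered = term.lower()
    for defined, uri in terms.items():
        if defined.lower() == lowered:
            return uri
    return None


xhv = "http://www.w3.org/1999/xhtml/vocab#"
predefined = {"license": xhv + "license", "next": xhv + "next"}
```

With these definitions, term_to_uri("License", predefined) falls through to the case insensitive step, while passing a default_term_uri short-circuits the lookup entirely.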