Today, developers have access to powerful XML parsers, and XML technology lets them offload marshalling issues. They are no longer required to write custom serialization code. There is a generic approach for handling different data structures and a simple approach for transforming XML documents into business objects.
While highly generic and dynamic components have made life easier for developers, they also serve as the foundation for XML parser based attacks. These include Denial of Service (DoS) attacks, inclusion of local files into XML documents, port scanning from the system where the XML parser is located, and overloading of XML Schema from foreign locations.
To start with, it has become easy to generate an XML document on a Windows machine and consume it on a Linux machine, and the libraries that generate the XML may not be the same as the ones that parse the uploaded files.
A quick recap of XML and its elements. XML is a standard for exchanging structured data in a textual format. The format of an XML document is defined by either a Document Type Definition (DTD) or an XML Schema. An XML document is well-formed if it adheres to the XML syntax specification, and valid if it also adheres to its DTD or XML Schema. When used incorrectly, these document definition and validation features can lead to security vulnerabilities in applications that use XML.
A quick recap of the DTD and its constituents. A DTD is a declarative syntax used to specify how elements and references appear in a document of a particular type. A document can be checked for validity against a DTD. In addition, entities can be declared in the DTD to define variables (similar to textual macros) to be used later in the DTD or the XML document. To resolve external entities, an XML parser may use various network protocols and services (DNS, FTP, HTTP, SMB, etc.), depending on the scheme (protocol) specified in the URL. External entities are used to create dynamic references, so that any change made to the referenced resource is automatically reflected in the document.
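To make the macro analogy concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The document, the entity name "company" and its value are made up for illustration. It only shows an internal entity, which the parser expands before the application sees the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares one entity,
# "company", which the body then references twice.
DOC = """<!DOCTYPE note [
  <!ENTITY company "Example Corp">
]>
<note><from>&company;</from><footer>Copyright &company;</footer></note>"""


def expand_entities(xml_text):
    """Parse the document and return each child element's text.

    The parser replaces every &company; reference with the declared
    value, so the application only ever sees the expanded text.
    """
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    print(expand_entities(DOC))
```

The application code never mentions the entity. The substitution happens entirely inside the parser, which is what makes entities convenient and, as the following sections show, risky.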
[Some content, including all code snippets, comes from other sites.]
Injection of an XML fragment
XML generators build the XML documents. Depending on the generator, an attacker may be able to inject XML document fragments. When the XML is generated in the front end, the attacker injects a fragment, which is then sent to the server. The example here shows the injection of an XML fragment into the comment field of an online banking payment form (marked in red).
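The injection described above can be sketched in a few lines of Python. This is an illustration under assumed names, not the payment form from the original example: the element names (payment, amount, comment) and the attacker input are hypothetical. The sketch contrasts a generator that pastes user input into the markup with one that escapes it first.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_payment_naive(amount, comment):
    """Vulnerable: user input is pasted into the markup unescaped."""
    return (f"<payment><amount>{amount}</amount>"
            f"<comment>{comment}</comment></payment>")


def build_payment_escaped(amount, comment):
    """Safer: escape &, < and > so the input stays character data."""
    return (f"<payment><amount>{escape(str(amount))}</amount>"
            f"<comment>{escape(comment)}</comment></payment>")


def amounts(xml_text):
    """Return the text of every <amount> element in the document."""
    return [el.text for el in ET.fromstring(xml_text).iter("amount")]


# Hypothetical attacker input typed into the comment field: it closes
# the comment element, adds a second amount, and reopens a comment.
MALICIOUS_COMMENT = "thanks</comment><amount>9999</amount><comment>"

if __name__ == "__main__":
    print(amounts(build_payment_naive(10, MALICIOUS_COMMENT)))
    print(amounts(build_payment_escaped(10, MALICIOUS_COMMENT)))
```

With the naive generator, the document ends up with two amount elements, and which one the server honours depends on its parsing logic. With escaping, the injected markup remains plain text inside the comment.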
Adding other files via the DTD – processing external entities, loading content from local devices
A DTD allows the inclusion of other XML documents (such as web.xml) as well as any other file (such as /etc/passwd). The /etc/passwd example is given to emphasize the seriousness of the issue; in practice, most XML parsers may have difficulty with this file, since they often require the included content to be parseable.
The attacker includes a short DTD in the document to define the “file” external entity, which references a configuration file local to the vulnerable application. When the XML is evaluated, the contents of the configuration file are included inline in the Designation field. Because entity evaluation occurs within the XML parser, the application receiving this request has no simple way of knowing that the content of the Designation field was not actually a literal string.
Once this is done, the attacker only needs to ask the application for the previously submitted employee profile information, and receives it along with the contents of the desired file. Simple text files can be retrieved with this approach. Any XML special characters in the file will generate a parse error, which typically blocks inclusion of the entity in the document. The error is raised inside the parser when the entity is expanded, so the application generally sees only a failed parse.
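The external entity attack described above can be illustrated with a request document and one possible defensive check. This is a minimal sketch, not the original example: the element names and the file path are hypothetical, and the check uses the EntityDeclHandler hook of Python's xml.parsers.expat, which reports entity declarations without fetching anything. Rejecting the document as soon as an external entity is declared is one option among several; disabling DTD processing altogether is another.

```python
import xml.parsers.expat


class ExternalEntityForbidden(ValueError):
    """Raised when a document declares an external entity."""


def reject_external_entities(xml_text):
    """Parse xml_text, aborting at the first external entity declaration."""
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # expat passes value=None for external entities; internal
        # entities carry their replacement text in `value`.
        if value is None:
            raise ExternalEntityForbidden(
                f"external entity {name!r} -> {system_id!r}")

    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)


# Hypothetical attacker request: the "file" entity points at a local
# file and is referenced from the designation field.
XXE_DOC = """<!DOCTYPE employee [
  <!ENTITY file SYSTEM "file:///path/to/app/config.xml">
]>
<employee><name>Alice</name><designation>&file;</designation></employee>"""

if __name__ == "__main__":
    try:
        reject_external_entities(XXE_DOC)
    except ExternalEntityForbidden as exc:
        print("rejected:", exc)
```

The check runs while the DTD is being read, so the document is refused before the body and its entity reference are processed.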
Another, similar exploit is the “billion laughs” attack, which defines nested entities within an XML DTD to build an XML memory bomb.
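The growth behind such a memory bomb is easy to quantify. The sketch below builds a nested-entity document for a chosen depth and fan-out and computes the expanded size arithmetically; the entity names (lol0, lol1, ...) and the helper functions are illustrative assumptions. Only a deliberately tiny instance is parsed; the larger one is never handed to a parser.

```python
import xml.etree.ElementTree as ET


def build_entity_bomb(levels, fanout, seed="lol"):
    """Build a nested-entity document.

    lol0 holds the seed string; each lolN (N >= 1) consists of
    `fanout` references to lol(N-1). The body references the top level.
    """
    decls = [f'<!ENTITY lol0 "{seed}">']
    for n in range(1, levels + 1):
        refs = f"&lol{n - 1};" * fanout
        decls.append(f'<!ENTITY lol{n} "{refs}">')
    dtd = "\n".join(decls)
    return f"<!DOCTYPE lolz [\n{dtd}\n]>\n<lolz>&lol{levels};</lolz>"


def expanded_length(levels, fanout, seed="lol"):
    """Characters produced once every entity is fully expanded."""
    return len(seed) * fanout ** levels


if __name__ == "__main__":
    # Tiny instance, safe to parse: 3 ** 2 = 9 copies of "lol".
    small = build_entity_bomb(levels=2, fanout=3)
    print(len(ET.fromstring(small).text), expanded_length(2, 3))
    # Classic proportions, computed only and never parsed here.
    print(len(build_entity_bomb(9, 10)), expanded_length(9, 10))
```

With nine levels and a fan-out of ten, a document of well under a kilobyte expands to 3,000,000,000 characters, which is where the name "billion laughs" comes from.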
While the XML specifications do not require any specific URL schemes to be supported, each XML parser supports its own set of URL schemes. Some platforms expose all URL schemes supported by the underlying networking libraries. By invoking URLs from within XML external entities, an attacker can leverage the system hosting the XML parser to initiate potentially malicious requests to third-party systems. These “server-side request forgery” (SSRF) techniques can allow attacks against other internal services, even ones that are local to the machine and not otherwise exposed.
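One way to limit the requests a parser can be made to issue is to check every external reference against an allowlist before resolving it. The sketch below shows only the policy check; the allowed scheme and host are hypothetical values, and wiring the function into a particular parser's entity resolver hook is left out because that hook differs between libraries.

```python
from urllib.parse import urlsplit

# Hypothetical policy: the only scheme and host this application is
# willing to let an XML parser contact. Everything else is refused.
ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"schemas.example.com"}


def external_reference_allowed(system_id):
    """Return True only if the entity URL matches the allowlist."""
    parts = urlsplit(system_id)
    return (parts.scheme in ALLOWED_SCHEMES
            and parts.hostname in ALLOWED_HOSTS)


if __name__ == "__main__":
    candidates = [
        "https://schemas.example.com/order.dtd",  # allowed by the policy
        "file:///etc/passwd",                     # local file access
        "http://localhost:8080/admin",            # service local to the host
        "ftp://10.0.0.5/",                        # internal network probe
    ]
    for url in candidates:
        print(url, external_reference_allowed(url))
```

An allowlist is used here rather than a blocklist of internal addresses, since the set of schemes and hosts an attacker might try is open-ended while the set of legitimate references is usually small.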