XML Formatter & Validator, beautify, minify & check well-formedness.

Paste raw XML to format it with configurable indentation, compress it to a single line, or validate that it is well-formed — with precise line-and-column error reporting. Uses the browser's native XML parser. Nothing leaves your device.

Reference

XML node types & syntax

  • Element: <tag attr="val">content</tag>
  • Self-closing element: <tag attr="val"/>
  • Attribute: name="value"
  • Text node: Plain text content
  • Comment: <!-- comment text -->
  • CDATA section: <![CDATA[raw & text]]>
  • XML declaration: <?xml version="1.0"?>
  • Processing instruction: <?target data?>

XML requires exactly one root element · All tags must be closed · Attribute values must be quoted · Case-sensitive tag names

XML reference

XML fundamentals, well-formedness rules, and common use cases.

What is XML?

XML (Extensible Markup Language) is a text-based markup language for structured data, published as a W3C Recommendation in 1998. Unlike HTML, which has a fixed set of element names, XML is extensible: any document author can define their own elements and attributes. The format is both human-readable and machine-parseable, and it remains widely used for configuration files, data exchange protocols (SOAP, RSS, Atom), office documents (OOXML, ODF), vector graphics (SVG), and many enterprise systems.

Well-formedness vs validity

XML distinguishes between two levels of correctness:

  • Well-formed XML follows all the syntactic rules of the XML specification: exactly one root element, all tags properly closed and nested, all attribute values quoted, start and end tags that match in case, and proper escaping of special characters. This is what this tool checks.
  • Valid XML is well-formed and additionally conforms to a declared schema: a DTD, an XSD (XML Schema Definition), or a RELAX NG schema. Validity checking requires a schema to validate against and is beyond the scope of a browser-side formatter. Use tools like xmllint or Oxygen XML Editor for full validation.

Core well-formedness rules

  • One root element: An XML document must have exactly one top-level element that contains all other elements. Multiple root-level elements (<a/><b/>) are not well-formed; wrap them in a single container (<root><a/><b/></root>).
  • All tags must be closed: Every opening tag must have a matching closing tag. The self-closing shorthand <tag/> is equivalent to <tag></tag> for empty elements. HTML's void elements (<br>, <img>) are not well-formed in XML without the self-closing slash.
  • Proper nesting: Tags must not overlap. The sequence <a><b></a></b> is not well-formed because <b> opens inside <a> but closes outside it.
  • Attribute values must be quoted: All attribute values must be enclosed in either double quotes (attr="value") or single quotes (attr='value'). Unquoted attributes valid in HTML are not permitted in XML.
  • Case-sensitive tag names: Unlike HTML, XML is case-sensitive. <Foo> and <foo> are different elements. The opening and closing tags must use the same case: <Foo></foo> is not well-formed.
  • Special characters must be escaped: In text content and attribute values, < and & must always be written as &lt; and &amp;. The character > may appear literally, except in the sequence ]]> within text content, but it is commonly escaped as &gt; for symmetry. Inside an attribute value, the quote character that delimits the value must also be escaped: &quot; in double-quoted values and &apos; in single-quoted values.
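
The rules above can be checked mechanically. The following is a minimal sketch in Python, using the standard library's expat-based parser rather than the browser parser this tool uses; the helper name is_well_formed and the sample strings are illustrative assumptions, not part of the tool.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape, quoteattr


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        parseString(text)
        return True
    except ExpatError:
        return False


# One deliberately broken sample per rule in the list above.
BROKEN_SAMPLES = {
    "two root elements": "<a/><b/>",
    "unclosed tag": "<p><br></p>",
    "overlapping tags": "<a><b></a></b>",
    "unquoted attribute": "<a id=1/>",
    "case mismatch": "<Foo></foo>",
    "unescaped ampersand": "<a>fish & chips</a>",
}

for rule, sample in BROKEN_SAMPLES.items():
    print(f"{rule}: well-formed = {is_well_formed(sample)}")

# Escaping repairs the last sample. escape() handles text content;
# quoteattr() escapes an attribute value and adds the surrounding quotes.
fixed = f"<a title={quoteattr('5 < 6')}>{escape('fish & chips')}</a>"
print(fixed, is_well_formed(fixed))
```

Every entry in BROKEN_SAMPLES is rejected, while the escaped version parses. Building markup with escape() and quoteattr() instead of raw string concatenation avoids the most common cause of the last rule being broken.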

XML node types

  • Elements: The structural building blocks of XML. Elements have a start tag, optional attributes, optional content (text or child elements), and an end tag. They may be self-closing if empty.
  • Attributes: Name-value pairs that provide metadata about an element. Unlike child elements, attributes have no order and cannot appear multiple times on the same element with the same name.
  • Text content: Character data between tags. Whitespace in text content is significant: the processor delivers all whitespace characters to the application. Attribute values are handled differently, because the parser normalises tabs and line breaks inside them to spaces.
  • Comments: <!-- comment -->. Comments may appear anywhere outside other markup: not inside a tag, and not before the XML declaration. A parser is not required to pass comments to the application, so they should be treated as advisory only. Comments may not contain the sequence -- (double hyphen).
  • CDATA sections: <![CDATA[...]]>. Allow raw character data that would otherwise require escaping. Everything inside a CDATA section is treated as literal text until the closing ]]> delimiter is reached. Useful for embedding HTML fragments, code, or SQL queries in XML.
  • Processing instructions: <?target data?>. Pass information to the consuming application. The XML declaration (<?xml version="1.0"?>) uses the same syntax but is formally a separate construct, not a processing instruction; the target name xml is reserved for it.
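
To see how a parser reports these node types, here is a small sketch using Python's xml.dom.minidom. The sample document, the NODE_NAMES table, and the walk helper are assumptions made for illustration; other DOM implementations may expose the same nodes slightly differently.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Human-readable labels for the DOM node type constants.
NODE_NAMES = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.CDATA_SECTION_NODE: "cdata section",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
}

SAMPLE = (
    '<?xml version="1.0"?>'
    '<?render mode="fast"?>'
    '<note id="n1">'
    "hello"
    "<!-- a comment -->"
    "<![CDATA[raw & text]]>"
    "<empty/>"
    "</note>"
)


def walk(node, depth=0, out=None):
    """Collect (depth, kind, detail) for every node below `node`."""
    out = [] if out is None else out
    for child in node.childNodes:
        kind = NODE_NAMES.get(child.nodeType, "other")
        if child.nodeType == Node.ELEMENT_NODE:
            # Attributes belong to the element; they are not child nodes.
            detail = (child.tagName, dict(child.attributes.items()))
        elif child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            detail = (child.target, child.data)
        else:
            detail = child.nodeValue
        out.append((depth, kind, detail))
        walk(child, depth + 1, out)
    return out


nodes = walk(parseString(SAMPLE))
for depth, kind, detail in nodes:
    print("  " * depth + f"{kind}: {detail!r}")
```

Note that the XML declaration does not show up in the walk at all, while the <?render ...?> instruction does, which reflects the distinction between the two.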

XML namespaces

XML namespaces prevent name collisions when combining XML from different vocabularies. A namespace is declared with an xmlns attribute (xmlns="..." for a default namespace, xmlns:prefix="..." for a prefixed one) and identified by a URI:

<root xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <xhtml:div>content</xhtml:div>
</root>

The namespace URI is just an identifier; the XML processor does not fetch any resource at that URL. The prefix (xhtml:) is only a local shorthand, scoped to the element that declares it and that element's descendants. The namespace is defined by the URI, not the prefix.
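
A short sketch of the "URI, not prefix" point, using Python's xml.etree.ElementTree. The second document and its h: prefix are made up for the comparison.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Same namespace URI, two different prefixes.
doc_a = ET.fromstring(
    f'<root xmlns:xhtml="{XHTML}"><xhtml:div>content</xhtml:div></root>'
)
doc_b = ET.fromstring(
    f'<root xmlns:h="{XHTML}"><h:div>content</h:div></root>'
)

# ElementTree expands each prefixed name to "{namespace-uri}local-name".
tag_a = doc_a[0].tag
tag_b = doc_b[0].tag
print(tag_a)
print(tag_a == tag_b)

# Lookups therefore use the URI and work regardless of the prefix chosen.
found = doc_b.find(f"{{{XHTML}}}div")
print(found.text)
```

Both documents produce the same expanded element name, so code that matches on the URI keeps working when a document author picks a different prefix.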

XML vs JSON: when to use each

  • Use XML when: your data has mixed content (text interspersed with markup, like XHTML or DocBook), you need comments in the data, you need namespaces to combine vocabularies, you are working with established XML standards (RSS/Atom, SOAP, SVG, OOXML), or your toolchain requires XPath/XSLT processing.
  • Use JSON when: you are building a REST API, your data is simple key-value or array structures, you are targeting JavaScript frontends, or you need smaller payload sizes (JSON is typically 30–40% smaller than equivalent XML, though the exact saving depends on how the data is structured).
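
The size difference is easy to measure for your own data rather than taking a general figure on trust. The sketch below serialises one small, made-up record both ways with Python's standard library; the record and the element-per-field XML layout are assumptions, and a different layout (for example attributes instead of child elements) would give a different ratio.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record used only for the comparison.
record = {"id": "42", "name": "Ada", "email": "ada@example.com"}

# Compact JSON: no spaces after separators.
as_json = json.dumps(record, separators=(",", ":"))

# Equivalent XML: one child element per field.
root = ET.Element("user")
for key, value in record.items():
    ET.SubElement(root, key).text = value
as_xml = ET.tostring(root, encoding="unicode")

json_bytes = len(as_json.encode("utf-8"))
xml_bytes = len(as_xml.encode("utf-8"))
saving = 1 - json_bytes / xml_bytes
print(as_json)
print(as_xml)
print(f"JSON {json_bytes} B, XML {xml_bytes} B, JSON is {saving:.0%} smaller")
```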

For most new web APIs, JSON is the preferred choice. XML remains common in enterprise integration, document formats, and any context where its richer document features (namespaces, mixed content, comments, schema validation) are needed.

How this formatter works

This tool uses the browser's native DOMParser with the application/xml MIME type to parse the input. The parser is the same engine the browser uses to parse SVG and XHTML. When parsing fails, DOMParser does not throw an exception; it returns a document containing a <parsererror> element whose text content describes the first error. The exact wording of that message, including how the line and column are reported, varies between browsers.
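
The same parse-then-report pattern can be sketched outside the browser. This is a Python analogue, not the tool's code: Python's expat-based parser raises an exception carrying the position instead of returning a <parsererror> document, and the first_error helper and sample input are assumptions for illustration.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ErrorString, ExpatError


def first_error(text):
    """Return None if `text` is well-formed, else (line, column, message)."""
    try:
        parseString(text)
        return None
    except ExpatError as err:
        # expat reports a 1-based line and a 0-based column offset;
        # add 1 to the offset so both values are 1-based for display.
        return err.lineno, err.offset + 1, ErrorString(err.code)


BROKEN = "<root>\n  <item>one</item>\n  <item>two</itm>\n</root>"

result = first_error(BROKEN)
if result is None:
    print("well-formed")
else:
    line, column, message = result
    print(f"line {line}, column {column}: {message}")
```

Like the browser parser, this stops at the first error, so fixing one problem may reveal another further down the input.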

The formatted output is produced by a recursive DOM serialiser that walks the parsed document tree, re-escaping special characters and applying indentation. Because the serialiser only runs on input that parsed successfully and escapes everything it writes, the output is well-formed. Input that is not well-formed produces an error report instead of formatted output.
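
A recursive serialiser of this kind might look like the following Python sketch. It is not the tool's implementation: the function names, the choice to drop whitespace-only text nodes, and the trimming of text in mixed content are simplifying assumptions, and the XML declaration and any DOCTYPE are not re-emitted.

```python
from xml.dom import Node
from xml.dom.minidom import parseString
from xml.sax.saxutils import escape, quoteattr


def _open_tag(el, close=""):
    """Build a start tag (or self-closing tag) with re-escaped attributes."""
    attrs = "".join(
        f" {name}={quoteattr(value)}" for name, value in el.attributes.items()
    )
    return f"<{el.tagName}{attrs}{close}>"


def _serialise(node, indent, depth, lines):
    pad = indent * depth
    if node.nodeType == Node.ELEMENT_NODE:
        # Whitespace-only text nodes are the old indentation; drop them.
        kids = [
            k for k in node.childNodes
            if not (k.nodeType == Node.TEXT_NODE and not k.data.strip())
        ]
        if not kids:
            lines.append(pad + _open_tag(node, "/"))
        elif all(k.nodeType == Node.TEXT_NODE for k in kids):
            # Text-only elements stay on one line.
            text = escape("".join(k.data for k in kids).strip())
            lines.append(f"{pad}{_open_tag(node)}{text}</{node.tagName}>")
        else:
            lines.append(pad + _open_tag(node))
            for kid in kids:
                _serialise(kid, indent, depth + 1, lines)
            lines.append(f"{pad}</{node.tagName}>")
    elif node.nodeType == Node.TEXT_NODE:
        lines.append(pad + escape(node.data.strip()))
    elif node.nodeType == Node.CDATA_SECTION_NODE:
        lines.append(f"{pad}<![CDATA[{node.data}]]>")
    elif node.nodeType == Node.COMMENT_NODE:
        lines.append(f"{pad}<!--{node.data}-->")
    elif node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
        lines.append(f"{pad}<?{node.target} {node.data}?>")


def format_xml(text, indent="  ", newline="\n"):
    """Parse, then re-serialise with indentation. Raises ExpatError on bad input."""
    doc = parseString(text)
    lines = []
    for child in doc.childNodes:
        _serialise(child, indent, 0, lines)
    return newline.join(lines)


def minify_xml(text):
    """Same walk with no indentation and no line breaks."""
    return format_xml(text, indent="", newline="")


source = '<root><item id="1">a &amp; b</item><empty></empty><!--note--></root>'
pretty = format_xml(source)
print(pretty)
print(minify_xml(pretty))
```

Formatting and minifying share one tree walk and differ only in the indent and line-break strings, which is one way to keep the two outputs consistent with each other.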