Wiki source code of Confluence XML Format
Last modified by Vincent Massol on 2026/04/08 13:50
| author | version | line-number | content |
|---|---|---|---|
| 1 | {{toc/}} | ||
| 2 | |||
| 3 | == Scope == | ||
| 4 | |||
| 5 | This document presents the format of the Confluence export packages. It is targeted at: | ||
| 6 | |||
| 7 | * technical people working on Confluence migration tools or tools involving Confluence exports | ||
| 8 | * technical people running migrations who need to deeply investigate some issues | ||
| 9 | * people curious about the Confluence export format | ||
| 10 | |||
| 11 | == Introduction == | ||
| 12 | |||
| 13 | A Confluence export package is a zip file containing an ##attachments## folder, an ##exportDescriptor.properties## file, and an ##entities.xml## file. It also sometimes contains a config and a plugin-data folder as well as other files which we haven't been using so far. | ||
| 14 | |||
| 15 | We know about two kinds of Confluence backup packages: | ||
| 16 | |||
| 17 | * A **space backup package** is produced from the settings of a space in Confluence. The ##exportDescriptor.properties## contains the name of the space selected for export from Confluence. It contains information about the exported space, and notifications, pages, attachments, permissions related to this space. It doesn't contain the attachments folder if the option to leave attachments out was selected. | ||
| 18 | * A **site backup package** is produced from the global administration in Confluence. It contains all the spaces, as well as **users and groups**, but it **doesn't contain the attachments** folder. | ||
| 19 | |||
| 20 | == The entities.xml file == | ||
| 21 | |||
| 22 | === Overview === | ||
| 23 | |||
| 24 | This is an XML 1.0 UTF-8 file that looks like a **dump of the Hibernate database of Confluence**. It is close to the [[Confluence SQL schema>>https://confluence.atlassian.com/doc/confluence-data-model-127369837.html]], but not exactly the same. The differences probably come from their Hibernate configuration. | ||
| 25 | |||
| 26 | It starts with an XML prolog, and everything is contained in a ##hibernate-generic## root node that has a ##datetime## attribute that contains the date of the export following the ##YYYY-MM-DD HH:mm:ss## format. | ||
| 27 | |||
| 28 | {{code language="xml"}} | ||
| 29 | <?xml version="1.0" encoding="UTF-8"?> | ||
| 30 | <hibernate-generic datetime="2013-10-14 16:05:52"> | ||
| 31 | |||
| 32 | {{/code}} | ||
| 33 | |||
| 34 | The root ##hibernate-generic## node contains ##object## nodes describing all sorts of objects representing what is in a Confluence instance. | ||
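
As a rough illustration, here is how such a file could be streamed with Python's standard ##xml.etree.ElementTree## module. This is only a sketch on our side, not code taken from any migration tool: the ##read_export## function and the ##SAMPLE## content are hypothetical names made up for this example.

```python
import io
import xml.etree.ElementTree as ET
from datetime import datetime

# Tiny made-up package content, shaped like the samples in this document.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<hibernate-generic datetime="2013-10-14 16:05:52">
<object class="Page" package="com.atlassian.confluence.pages">
<id name="id">753689</id>
<property name="title"><![CDATA[privatespace Home]]></property>
</object>
</hibernate-generic>
"""


def read_export(stream):
    """Return (export date, [(package, class, id), ...]) for an entities.xml stream."""
    export_date = None
    objects = []
    for event, node in ET.iterparse(stream, events=("start", "end")):
        if event == "start" and node.tag == "hibernate-generic":
            # Attributes are already available on the "start" event.
            raw_date = node.get("datetime")
            if raw_date:
                export_date = datetime.strptime(raw_date, "%Y-%m-%d %H:%M:%S")
        elif event == "end" and node.tag == "object":
            id_node = node.find("id")  # None for objects using <composite-id>
            has_id = id_node is not None and id_node.text
            object_id = id_node.text.strip() if has_id else None
            objects.append((node.get("package"), node.get("class"), object_id))
            node.clear()  # drop the children we no longer need, to limit memory use
    return export_date, objects


export_date, objects = read_export(io.BytesIO(SAMPLE))
print(export_date, objects)
```

Streaming with ##iterparse## rather than loading the whole tree is a design choice for this sketch: ##entities.xml## files from large instances can be big.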
| 35 | |||
| 36 | === Common concepts === | ||
| 37 | |||
| 38 | Here is what an object looks like, with its usual indentation as it appears in a typical entities.xml file from Confluence server: | ||
| 39 | |||
| 40 | {{code language="xml"}} | ||
| 41 | <object class="Page" package="com.atlassian.confluence.pages"> | ||
| 42 | <id name="id">753689</id> | ||
| 43 | <property name="position"/><collection name="children" class="java.util.Collection"><element class="Page" package="com.atlassian.confluence.pages"><id name="id">753692</id> | ||
| 44 | </element> | ||
| 45 | </collection> | ||
| 46 | <property name="space" class="Space" package="com.atlassian.confluence.spaces"><id name="id">786435</id> | ||
| 47 | </property> | ||
| 48 | <property name="title"><![CDATA[privatespace Home]]></property> | ||
| 49 | <collection name="bodyContents" class="java.util.Collection"><element class="BodyContent" package="com.atlassian.confluence.core"><id name="id">819224</id> | ||
| 50 | </element> | ||
| 51 | </collection> | ||
| 52 | <property name="version">1</property> | ||
| 53 | <property name="creatorName"><![CDATA[admin]]></property> | ||
| 54 | <property name="creationDate">2013-10-14 15:37:24.463</property> | ||
| 55 | <property name="lastModifierName"><![CDATA[admin]]></property> | ||
| 56 | <property name="lastModificationDate">2013-10-14 15:37:24.463</property> | ||
| 57 | <property name="versionComment"><![CDATA[]]></property> | ||
| 58 | <property name="contentStatus"><![CDATA[current]]></property> | ||
| 59 | <collection name="comments" class="java.util.Collection"><element class="Comment" package="com.atlassian.confluence.pages"><id name="id">753690</id> | ||
| 60 | </element> | ||
| 61 | </collection> | ||
| 62 | </object> | ||
| 63 | {{/code}} | ||
| 64 | |||
| 65 | An object has a **type** defined by its **class** and **package attributes**. In theory, the package attribute cannot be ignored. In practice, it can. | ||
| 66 | |||
| 67 | It has an id (defined with an ##id## node), and properties (defined with ##property## and ##collection## nodes) which depend on the type of the object. | ||
| 68 | |||
| 69 | We list the 3 node types that can appear in an object node. | ||
| 70 | |||
| 71 | ==== ##id## nodes ==== | ||
| 72 | |||
| 73 | This node defines the object's unique identifier, which we will call //id// in the rest of this document. It appears exactly once per object. Although it probably cannot be relied upon, we've always seen it appear as the first child node of the object. It usually has a ##name## attribute with the value ##"id"##, except for ConfluenceUserImpl objects where it has the value ##"key"##. | ||
| 74 | |||
| 75 | ==== ##property## nodes ==== | ||
| 76 | |||
| 77 | This node defines a property with a primitive type. | ||
| 78 | |||
| 79 | The name of the property is given by the ##name## attribute. | ||
| 80 | |||
| 81 | Dates follow the ##YYYY-MM-DD HH:mm:ss.xxx## format. | ||
| 82 | |||
| 83 | Strings (including enum values) are (apparently always) in a CDATA section, and numbers and dates are not. | ||
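
The following sketch shows one possible way to read ##property## nodes in Python. It assumes the standard ##xml.etree.ElementTree## parser, which exposes CDATA content as plain text, so the CDATA / non-CDATA distinction is lost and the caller has to know which type to expect. ##read_properties## and ##parse_date## are hypothetical helpers. Properties that carry ##class## and ##package## attributes and an ##id## child (like ##space## in the sample above) are returned as references.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

SAMPLE = """<object class="Page" package="com.atlassian.confluence.pages">
<id name="id">753689</id>
<property name="position"/>
<property name="space" class="Space" package="com.atlassian.confluence.spaces"><id name="id">786435</id>
</property>
<property name="title"><![CDATA[privatespace Home]]></property>
<property name="version">1</property>
<property name="creationDate">2013-10-14 15:37:24.463</property>
</object>"""


def read_properties(object_node):
    """Map property names to raw text, None (empty), or a (package, class, id) reference."""
    props = {}
    for prop in object_node.findall("property"):
        id_node = prop.find("id")
        if id_node is not None:
            # Reference to another object, e.g. the space of a page.
            props[prop.get("name")] = (prop.get("package"), prop.get("class"), id_node.text)
        else:
            props[prop.get("name")] = prop.text  # None for <property name="..."/>
    return props


def parse_date(text):
    """Parse the YYYY-MM-DD HH:mm:ss.xxx format used by date properties."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


props = read_properties(ET.fromstring(SAMPLE))
print(props["title"], int(props["version"]), parse_date(props["creationDate"]))
```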
| 84 | |||
| 85 | ==== ##collection## nodes ==== | ||
| 86 | |||
| 87 | This node defines a property with a value that is a collection of objects. | ||
| 88 | |||
| 89 | The name of the property is given by the ##name## attribute. The Java type of the collection is given by the ##class## attribute. It can be an interface or a concrete type. From what we have seen, collections always contain object ids that are put in ##<id name="id">## nodes, themselves each put inside ##element## nodes having a ##class## and a ##package## attribute that are equal to those of the node of the pointed objects. Said differently, ##collection## nodes contain ##element## nodes, each one containing exactly one ##id## node that contains the id of an object. | ||
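
A minimal Python sketch of this structure, assuming the layout described above always holds. ##read_collections## is a hypothetical helper and the sample is a shortened version of the object shown earlier.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<object class="Page" package="com.atlassian.confluence.pages">
<id name="id">753689</id>
<property name="position"/><collection name="children" class="java.util.Collection"><element class="Page" package="com.atlassian.confluence.pages"><id name="id">753692</id>
</element>
</collection>
</object>"""


def read_collections(object_node):
    """Map each collection name to a list of (package, class, id) references."""
    collections = {}
    for collection in object_node.findall("collection"):
        refs = []
        for element in collection.findall("element"):
            id_node = element.find("id")
            if id_node is not None and id_node.text:
                refs.append((element.get("package"), element.get("class"), id_node.text.strip()))
        # The list keeps the file order, which carries no meaning (see the warning).
        collections[collection.get("name")] = refs
    return collections


collections = read_collections(ET.fromstring(SAMPLE))
print(collections)
```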
| 90 | |||
| 91 | {{warning}} | ||
| 92 | The type advertised by the class attribute doesn't lie: **you usually cannot rely on any order**. There's no guarantee a java.util.Collection is ordered. The order in which collections are output probably depends on whatever the database engine used by Confluence returned, and then on the order in which the actual implementation of java.util.Collection used by Confluence returns elements when iterated over. | ||
| 93 | \\For example, attachments or child documents are not (necessarily) ordered by date or by version number. When parsing a collection, if you need a certain order, you need to implement a sort method that uses the dates or the versions (or more likely a combination of both, because it happens that one of the fields is missing) stored in properties of the pointed objects. | ||
| 94 | |||
| 95 | See for instance: | ||
| 96 | |||
| 97 | * [[https:~~/~~/jira.xwiki.org/browse/CONFLUENCE-405>>https://jira.xwiki.org/browse/CONFLUENCE-405]] | ||
| 98 | * [[https:~~/~~/jira.xwiki.org/browse/CONFLUENCE-415>>https://jira.xwiki.org/browse/CONFLUENCE-415]] | ||
| 99 | * [[https:~~/~~/jira.xwiki.org/browse/CONFLUENCE-416>>https://jira.xwiki.org/browse/CONFLUENCE-416]] | ||
| 100 | {{/warning}} | ||
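
To make the sorting advice more concrete, here is a hedged Python sketch of a sort key that combines the version and the modification date and tolerates a missing version. The ordering policy (version first, date as a tie-breaker, missing values first) is an assumption made for this example, not a rule we know Confluence follows. The revision data is made up.

```python
def revision_sort_key(props):
    """Sort by version, then by modification date; missing values sort first."""
    version = props.get("version")
    date = props.get("lastModificationDate")
    # Dates in the fixed-width YYYY-MM-DD HH:mm:ss.xxx format compare correctly as strings.
    return (int(version) if version else -1, date or "")


# Made-up revisions: two share version 2, one has no version at all.
revisions = [
    {"id": "3", "version": "2", "lastModificationDate": "2013-10-14 15:40:00.000"},
    {"id": "1", "version": "1", "lastModificationDate": "2013-10-14 15:37:24.463"},
    {"id": "2", "version": "2", "lastModificationDate": "2013-10-14 15:38:00.000"},
    {"id": "4", "version": None, "lastModificationDate": "2013-10-14 15:36:00.000"},
]
ordered = [r["id"] for r in sorted(revisions, key=revision_sort_key)]
print(ordered)
```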
| 101 | |||
| 102 | ==== Notes on order, ids, relationship and how many things can(not) be relied upon ==== | ||
| 103 | |||
| 104 | * **Objects appear in no particular order** that could be relied upon, or so it seems. It is very well possible that an object references another that has not yet been dumped. | ||
| 105 | * **Each object has a unique id** (e.g. we have not seen a Page object having the same id as a User object), although we don't currently rely on this in the filter module (but this is not a promise) | ||
| 106 | * **The id order cannot be relied upon.** An older object can have a greater id. We believe this can happen because some import / backup restore mechanism in Confluence doesn't preserve the ids (the handling of ids might be left to the database engine, and since they are dumped in backups in no particular order, they are not created in the database in the same order as before the backup, or something like this). | ||
| 107 | * There is **some unreliable duplication in how objects declare their relationship**, and in particular their parents and children, and sometimes their ancestors. Usually, everything is there but we've noticed **this cannot be relied upon**. Sometimes, one way is missing for some reason. For this reason, one needs to implement both ways when parsing the export package. | ||
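
The last point can be sketched as follows in Python. The idea is to load all the objects first (since they come in no particular order) and then merge the parent to child links declared in both directions. ##build_children_index## is a hypothetical helper, the input dictionaries are assumed to have been filled from the ##parent## property and ##children## collection of Page objects, and the third page is made up to show a link declared in one direction only.

```python
from collections import defaultdict


def build_children_index(pages):
    """pages: {id: {"parent": id or None, "children": [ids]}} -> {parent id: sorted child ids}."""
    children = defaultdict(set)
    for page_id, page in pages.items():
        # Direction 1: the child declares its parent.
        if page.get("parent") is not None:
            children[page["parent"]].add(page_id)
        # Direction 2: the parent lists its children.
        for child_id in page.get("children", []):
            children[page_id].add(child_id)
    return {parent: sorted(kids) for parent, kids in children.items()}


pages = {
    "753689": {"parent": None, "children": ["753692"]},
    "753692": {"parent": "753689", "children": []},
    # Made up: this child is missing from the children collection of its parent.
    "753700": {"parent": "753689", "children": []},
}
index = build_children_index(pages)
print(index)
```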
| 108 | |||
| 109 | ==== Referencing a user ==== | ||
| 110 | |||
| 111 | Objects are usually referenced using their id. For users, we found 3 ways it is done: | ||
| 112 | |||
| 113 | * The InternalUser id, which is a regular number | ||
| 114 | * The ConfluenceUserImpl id (what we call the "user key"), which appears to be a hexadecimal string | ||
| 115 | * The user name, found in the name property of ConfluenceUserImpl and of InternalUser objects | ||
| 116 | |||
| 117 | ==== Usual creation and modification properties ==== | ||
| 118 | |||
| 119 | Many object types have the following properties in common. We describe them here once to avoid repetition. | ||
| 120 | |||
| 121 | * creatorName: the username of the creator of the page. Used by older versions of Confluence. See also creator, which is used by newer versions. In the general case, you'll have to check both properties. | ||
| 122 | * creator: the user //key// of the creator of the page (and not of the revision!), which you can turn into a username using the corresponding ConfluenceUserImpl object. Used by more recent versions of Confluence. See also creatorName for the property used by older versions. In the general case, you'll have to check both properties. See also lastModifier and lastModifierName for the user who created this specific revision. | ||
| 123 | * creationDate: the creation date of the first revision of the page. See also lastModificationDate. | ||
| 124 | * lastModificationDate: the creation date of this specific page revision | ||
| 125 | * lastModifierName: the username of the user who created this revision. Used by older versions of Confluence. See also lastModifier, used by more recent versions of Confluence. In the general case, you'll have to check both properties. See also creator and creatorName for the user who created the first revision of this page. | ||
| 126 | * lastModifier: the user //key// of the user who created this revision. Used by more recent versions of Confluence. See also lastModifierName, used by older versions of Confluence. In the general case, you'll have to check both properties. See also creator and creatorName for the user who created the first revision of this page. | ||
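
Checking both properties could look like the following Python sketch. ##resolve_user## is a hypothetical helper, and the ##user_names## mapping (user key to username) is assumed to have been built beforehand from the ##ConfluenceUserImpl## objects. The username ##jdoe## is made up.

```python
def resolve_user(props, base, user_names):
    """Return the username for base ("creator" or "lastModifier"), or None if unknown."""
    key = props.get(base)  # user key, used by more recent versions of Confluence
    if key is not None and key in user_names:
        return user_names[key]
    return props.get(base + "Name")  # username, used by older versions


user_names = {"01f7c1ca483b2b1c01483b2d4f4206cc": "jdoe"}
old_style = {"creatorName": "admin", "lastModifierName": "admin"}
new_style = {"creator": "01f7c1ca483b2b1c01483b2d4f4206cc"}

print(resolve_user(old_style, "creator", user_names))
print(resolve_user(new_style, "creator", user_names))
```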
| 127 | |||
| 128 | === Known object types === | ||
| 129 | |||
| 130 | ==== ##bucket.user.propertyset.BucketPropertySetItem## ==== | ||
| 131 | |||
| 132 | {{code language="xml"}} | ||
| 133 | <object class="BucketPropertySetItem" package="bucket.user.propertyset"> | ||
| 134 | <composite-id><property name="entityName" type="string"><![CDATA[CWD_admin]]></property> | ||
| 135 | <property name="entityId" type="long">0</property> | ||
| 136 | <property name="key" type="string"><![CDATA[confluence.user.runtime.recent-changes.size]]></property> | ||
| 137 | </composite-id> | ||
| 138 | <property name="type">2</property> | ||
| 139 | <property name="booleanVal">false</property> | ||
| 140 | <property name="doubleVal">0.0</property> | ||
| 141 | <property name="stringVal"/><property name="textVal"/><property name="longVal">0</property> | ||
| 142 | <property name="intVal">30</property> | ||
| 143 | <property name="dateVal"/></object> | ||
| 144 | |||
| 145 | {{/code}} | ||
| 146 | |||
| 147 | We don't yet use these objects. See [[https:~~/~~/docs.atlassian.com/ConfluenceServer/javadoc/8.2.0-m27/bucket/user/propertyset/BucketPropertySetItem.html>>https://docs.atlassian.com/ConfluenceServer/javadoc/8.2.0-m27/bucket/user/propertyset/BucketPropertySetItem.html]] | ||
| 148 | |||
| 149 | ==== ##com.atlassian.confluence.core.BodyContent## ==== | ||
| 150 | |||
| 151 | {{code language="xml"}} | ||
| 152 | <object class="BodyContent" package="com.atlassian.confluence.core"> | ||
| 153 | <id name="id">819222</id> | ||
| 154 | <property name="body"><![CDATA[<p>Comment on homepage of space 2</p>]]></property> | ||
| 155 | <property name="content" class="Comment" package="com.atlassian.confluence.pages"><id name="id">753687</id> | ||
| 156 | </property> | ||
| 157 | <property name="bodyType">2</property> | ||
| 158 | </object> | ||
| 159 | |||
| 160 | {{/code}} | ||
| 161 | |||
| 162 | The content of a comment or a page. | ||
| 163 | |||
| 164 | The ##body## property contains the content in a CDATA section, in the syntax defined in the ##bodyType## property. Here are the body types we know about: | ||
| 165 | |||
| 166 | * 0: this is the old Confluence wiki syntax (the default) | ||
| 167 | * 1: this is raw character data | ||
| 168 | * 2: this is the XHTML storage format. | ||
| 169 | |||
| 170 | See also [[https://docs.atlassian.com/atlassian-confluence/6.6.0/com/atlassian/confluence/core/BodyType.html]] | ||
| 171 | |||
| 172 | The ##content## property refers to the object whose body this ##BodyContent## object describes. It works like the ##element## nodes, with the ##class## and ##package## attributes and the ##id name="id"## child. Here are the two types of content having body contents we know about: | ||
| 173 | |||
| 174 | * ##com.atlassian.confluence.pages.Comment## | ||
| 175 | * ##com.atlassian.confluence.pages.Page## | ||
| 176 | |||
| 177 | {{warning}} | ||
| 178 | The Confluence XHTML syntax is an XML dialect that can contain CDATA sections. Since it's already stored in a CDATA section in the ##body## property, their trick is to add a space to the CDATA end marker: ##]]>## becomes ##]] >##. You will have to pre-process this before parsing this value. | ||
| 179 | {{/warning}} | ||
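
A minimal Python sketch of this pre-processing step. ##restore_cdata_end## and the ##BODY_TYPES## labels are names made up for this example, and the sample body is only illustrative. Note that a plain string replacement also rewrites any ##]] >## sequence that was genuinely part of the content, which is a limitation of this sketch.

```python
# Labels chosen for this example; only the numeric values come from the export.
BODY_TYPES = {0: "wiki", 1: "raw", 2: "xhtml"}


def restore_cdata_end(body):
    """Undo the escaping of CDATA end markers inside a BodyContent body."""
    return body.replace("]] >", "]]>")


# Illustrative body, as it would come out of the XML parser for a bodyType of 2.
body = "<ac:plain-text-body><![CDATA[some code]] ></ac:plain-text-body>"
restored = restore_cdata_end(body)
print(BODY_TYPES[2], restored)
```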
| 180 | |||
| 181 | ==== ##com.atlassian.confluence.links.OutgoingLink## ==== | ||
| 182 | |||
| 183 | {{code language="xml"}} | ||
| 184 | <object class="OutgoingLink" package="com.atlassian.confluence.links"> | ||
| 185 | <id name="id">950286</id> | ||
| 186 | <property name="destinationPageTitle"><![CDATA[space1 Home]]></property> | ||
| 187 | <property name="destinationSpaceKey"><![CDATA[SPACE1]]></property> | ||
| 188 | <property name="sourceContent" class="Page" package="com.atlassian.confluence.pages"><id name="id">753668</id> | ||
| 189 | </property> | ||
| 190 | <property name="creatorName"><![CDATA[admin]]></property> | ||
| 191 | <property name="creationDate">2013-10-14 15:34:06.814</property> | ||
| 192 | <property name="lastModifierName"><![CDATA[admin]]></property> | ||
| 193 | <property name="lastModificationDate">2013-10-14 15:34:06.814</property> | ||
| 194 | </object> | ||
| 195 | |||
| 196 | {{/code}} | ||
| 197 | |||
| 198 | Describes an outgoing link. We haven't used those so far. ##OutgoingLink## objects have the usual creation and modification properties. | ||
| 199 | |||
| 200 | ==== ##com.atlassian.confluence.mail.notification.Notification## ==== | ||
| 201 | |||
| 202 | {{code language="xml"}} | ||
| 203 | <object class="Notification" package="com.atlassian.confluence.mail.notification"> | ||
| 204 | <id name="id">983041</id> | ||
| 205 | <property name="page" class="Page" package="com.atlassian.confluence.pages"><id name="id">753668</id> | ||
| 206 | </property> | ||
| 207 | <property name="userName"><![CDATA[admin]]></property> | ||
| 208 | <property name="creatorName"><![CDATA[admin]]></property> | ||
| 209 | <property name="creationDate">2013-10-14 15:07:38.873</property> | ||
| 210 | <property name="lastModifierName"><![CDATA[admin]]></property> | ||
| 211 | <property name="lastModificationDate">2013-10-14 15:07:38.873</property> | ||
| 212 | <property name="digest">false</property> | ||
| 213 | <property name="network">false</property> | ||
| 214 | <property name="type" enum-class="ContentTypeEnum" package="com.atlassian.confluence.search.service"/></object> | ||
| 215 | |||
| 216 | {{/code}} | ||
| 217 | |||
| 218 | A notification ("watch") setting. We don't yet use these objects. See [[https:~~/~~/docs.atlassian.com/atlassian-confluence/5.10.8/com/atlassian/confluence/mail/notification/Notification.html>>https://docs.atlassian.com/atlassian-confluence/5.10.8/com/atlassian/confluence/mail/notification/Notification.html]] | ||
| 219 | |||
| 220 | ==== ##com.atlassian.confluence.pages.Attachment## ==== | ||
| 221 | |||
| 222 | {{code language="xml"}} | ||
| 223 | <object class="Attachment" package="com.atlassian.confluence.pages"> | ||
| 224 | <id name="id">884739</id> | ||
| 225 | <property name="fileName"><![CDATA[Config.xml]]></property> | ||
| 226 | <property name="contentType"><![CDATA[text/xml]]></property> | ||
| 227 | <property name="content" class="Page" package="com.atlassian.confluence.pages"><id name="id">753668</id> | ||
| 228 | </property> | ||
| 229 | <property name="creatorName"><![CDATA[admin]]></property> | ||
| 230 | <property name="creationDate">2013-10-14 15:05:29.969</property> | ||
| 231 | <property name="lastModifierName"><![CDATA[admin]]></property> | ||
| 232 | <property name="lastModificationDate">2013-10-14 15:07:38.630</property> | ||
| 233 | <property name="fileSize">308</property> | ||
| 234 | <property name="comment"/><property name="attachmentVersion">1</property> | ||
| 235 | </object> | ||
| 236 | {{/code}} | ||
| 237 | |||
| 238 | Describes an attachment. The structure looks quite like that of ##Page## objects. See the ##Page## section for the following fields: ##navigationType##, ##contentStatus##, ##space##. ##Attachment## objects also have the usual creation and modification properties. | ||
| 239 | |||
| 240 | Specific fields: | ||
| 241 | |||
| 242 | * ##title##: the file name of the attachment in the wiki (not in the backup package). See also fileName. | ||
| 243 | * ##lowerTitle##: the lower case version of the title property. | ||
| 244 | * ##fileName##: the older name of the title property. | ||
| 245 | * ##contentType##: the mime type of the file. | ||
| 246 | * ##content##: an element-like property pointing to the content containing this attachment. See also containerContent. | ||
| 247 | * ##containerContent##: the former name (?) of the content property. | ||
| 248 | * ##fileSize##: the size of the file, in bytes. | ||
| 249 | * ##comment##: the user comment attached to this version of this attachment (note: XWiki doesn't have an equivalent feature at the time this was written) | ||
| 250 | * ##attachmentVersion##: an increasing number giving the revision number of this attachment. It's supposed to be unique per attachment. See also version. | ||
| 251 | * ##version##: another name for ##attachmentVersion##. Supposedly the new name of the property. | ||
| 252 | * ##originalVersion##: an element-like value pointing to the last revision of the attachment. See also ##originalVersionId## | ||
| 253 | * ##originalVersionId##: a numeric version of the ##originalVersion## property. Sometimes this property is used instead of ##originalVersion##. It is unclear when. This property can be present and empty. In this case, it should be treated as if it were not present at all. | ||
| 254 | * ##historicalVersions##: the older revisions of an attachment. | ||
| 255 | * ##imageDetailsDTO##: ??? | ||
| 256 | |||
| 257 | {{info}} | ||
| 258 | The actual files are only there on space exports if the attachment export was not disabled. In this case, they are in the attachments folder (see the next section). | ||
| 259 | {{/info}} | ||
| 260 | |||
| 261 | {{info}} | ||
| 262 | In practice, some kinds of corruptions may require you to sort attachments by both the version and the modification date, and detect and remove duplicates. | ||
| 263 | {{/info}} | ||
| 264 | |||
| 265 | |||
| 266 | ==== ##com.atlassian.confluence.pages.Page## ==== | ||
| 267 | |||
| 268 | {{code language="xml"}} | ||
| 269 | <object class="Page" package="com.atlassian.confluence.pages"> | ||
| 270 | <id name="id">753692</id> | ||
| 271 | <property name="position"/><property name="parent" class="Page" package="com.atlassian.confluence.pages"><id name="id">753689</id> | ||
| 272 | </property> | ||
| 273 | <collection name="ancestors" class="java.util.List"><element class="Page" package="com.atlassian.confluence.pages"><id name="id">753689</id> | ||
| 274 | </element> | ||
| 275 | </collection> | ||
| 276 | <property name="space" class="Space" package="com.atlassian.confluence.spaces"><id name="id">786435</id> | ||
| 277 | </property> | ||
| 278 | <property name="title"><![CDATA[Private page 1]]></property> | ||
| 279 | <collection name="bodyContents" class="java.util.Collection"><element class="BodyContent" package="com.atlassian.confluence.core"><id name="id">819226</id> | ||
| 280 | </element> | ||
| 281 | </collection> | ||
| 282 | <property name="version">1</property> | ||
| 283 | <property name="creatorName"><![CDATA[admin]]></property> | ||
| 284 | <property name="creationDate">2013-10-14 15:37:52.357</property> | ||
| 285 | <property name="lastModifierName"><![CDATA[admin]]></property> | ||
| 286 | <property name="lastModificationDate">2013-10-14 15:37:52.357</property> | ||
| 287 | <property name="versionComment"><![CDATA[]]></property> | ||
| 288 | <property name="contentStatus"><![CDATA[current]]></property> | ||
| 289 | </object> | ||
| 290 | {{/code}} | ||
| 291 | |||
| 292 | {{code language="xml" title="The historicalVersions property of another page"}} | ||
| 293 | <collection name="historicalVersions" class="java.util.Collection"><element class="Page" package="com.atlassian.confluence.pages"><id name="id">753670</id> | ||
| 294 | </element> | ||
| 295 | <element class="Page" package="com.atlassian.confluence.pages"><id name="id">753675</id> | ||
| 296 | </element> | ||
| 297 | <element class="Page" package="com.atlassian.confluence.pages"><id name="id">753676</id> | ||
| 298 | </element> | ||
| 299 | <element class="Page" package="com.atlassian.confluence.pages"><id name="id">753677</id> | ||
| 300 | </element> | ||
| 301 | <element class="Page" package="com.atlassian.confluence.pages"><id name="id">753678</id> | ||
| 302 | </element> | ||
| 303 | <element class="Page" package="com.atlassian.confluence.pages"><id name="id">753684</id> | ||
| 304 | </element> | ||
| 305 | </collection> | ||
| 306 | {{/code}} | ||
| 307 | |||
| 308 | {{code language="xml" title="The originalVersion property of another page"}} | ||
| 309 | <property name="originalVersion" class="Page" package="com.atlassian.confluence.pages"><id name="id">753668</id> | ||
| 310 | </property> | ||
| 311 | {{/code}} | ||
| 312 | |||
| 313 | This describes a Confluence page **revision**. Here are the properties we know about: | ||
| 314 | |||
| 315 | * position: an integer giving its position in the navigation menu of the space. See [[https://jira.xwiki.org/browse/CONFLUENCE-261]] | ||
| 316 | * ancestors: a collection of ids referring to the parents of the page up to but excluding the space: its direct parent, the direct parent of its direct parent, and so on and so forth. We have not been relying on this property. | ||
| 317 | * space: the id of the Space object describing the space in which the Page is | ||
| 318 | * title: the title of the page, which is supposed to be unique in the whole space | ||
| 319 | * lowerTitle: the lowercase version of the title, also supposed to be unique in the whole space | ||
| 320 | * bodyContents: a collection that contains the id of the object describing the content of the page. This property is a collection but we have only ever seen exactly one element in this collection. It is unclear why a collection is used here. | ||
| 321 | * version: a number which is the revision number of the page. It is supposed to be unique across a page and its historical revisions. In practice, we've seen duplicate versions in some exports. It is not clear where they come from, probably some sort of corruption. See for instance [[https://jira.xwiki.org/browse/CONFLUENCE-427]] | ||
| 322 | * versionComment: a comment written by the user at save time to describe this version | ||
| 323 | * contentStatus: contains the status of this Page. Here are the known values: | ||
| 324 | ** current: the page is current | ||
| 325 | ** draft: the page is a draft. We currently discard these pages. | ||
| 326 | ** deleted: the page was deleted. We currently discard these pages. | ||
| 327 | * originalVersion: this property is set only on historical versions of the page, and points to the last version of the page. Only older revisions have this property, the last revision doesn't have it and that's how you know a Page object describes the last revision of a Page. See also originalVersionId. | ||
| 328 | * originalVersionId: like originalVersion, used by older Confluence versions, directly a number instead of an element-like value. | ||
| 329 | * navigationType: ??? | ||
| 330 | * historicalVersions: an //unordered// collection of the older revisions of the page. Only the last revision of the page has this property. | ||
| 331 | * children or childrens: an //unordered// collection of the last versions of the direct children of the page (note: it's sometimes children//s// with an //s// at the end, sometimes children without the //s//) | ||
| 332 | * attachments: an //unordered// collection of the attachments, including their older versions | ||
| 333 | * comments: an unordered collection of comments | ||
| 334 | * outgoingLinks: an unordered collection of outgoing links | ||
| 335 | * contentPermissionSets: a collection of permission sets of type ContentPermissionSet, which are sets of permissions applying to this content | ||
| 336 | |||
| 337 | Page objects also have the usual creation and modification properties. | ||
| 338 | |||
| 339 | {{info}} | ||
| 340 | A page that has only one revision has neither an originalVersion property nor a historicalVersions property. | ||
| 341 | {{/info}} | ||
| 342 | |||
| 343 | {{info}} | ||
| 344 | The id of the last revision of a page is stable. In particular, a page that gets an additional revision still keeps this id because it is always the id of its last revision. That's what we call //stable id// or //stableId// in the XWiki Confluence project. | ||
| 345 | \\Our understanding is that when a revision is added, it is as if the following were happening: | ||
| 346 | |||
| 347 | 1. Object N of the last revision is copied to a new object M | ||
| 348 | 1. Object M now describes the former last revision | ||
| 349 | 1. Object M gets an originalVersion property pointing to object N | ||
| 350 | 1. The historicalVersions property is removed from object M | ||
| 351 | 1. M is added to the historicalVersions property of object N | ||
| 352 | |||
| 353 | Page revisions appear to have their ids changed when they stop being the last revisions (again, so the page can keep its "original" id). | ||
| 354 | |||
| 355 | This is likely why you can notice that ids in the historicalVersions property are often higher than the id of the "original" object itself. But remember: ⚠ **this is not guaranteed!** | ||
| 356 | {{/info}} | ||
| 357 | |||
| 358 | {{warning}} | ||
| 359 | Be careful: the term //original// may feel backwards. The first (oldest) version of a page is not the original version. It //was// the original version once in its life: when it was the current revision. The original version is the //current// version. | ||
| 360 | It makes sense if you think of "original" as referring to the "original" object that was copied each time a revision was created, and that kept the "original" id of the page. | ||
| 361 | {{/warning}} | ||
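
Under the understanding described above, Page objects can be grouped by stable id with something like the following Python sketch. ##group_revisions## is a hypothetical helper and the input is made up: each entry maps a Page id to its properties, with ##originalVersion## already reduced to the id it points at. An empty ##originalVersionId## is treated as absent. A real implementation would probably also merge in the ##historicalVersions## collections, since either direction can be missing.

```python
def group_revisions(page_objects):
    """page_objects: {id: properties} -> {stable id: [ids of historical revisions]}."""
    pages = {}
    for page_id, props in page_objects.items():
        original = props.get("originalVersion") or props.get("originalVersionId")
        if original:
            # Historical revision: attach it to the page it points at.
            pages.setdefault(original, []).append(page_id)
        else:
            # No originalVersion: this object is the last revision of a page.
            pages.setdefault(page_id, [])
    return pages


# Made-up input for illustration.
page_objects = {
    "753668": {"title": "space1 Home"},
    "753670": {"originalVersion": "753668"},
    "753675": {"originalVersionId": "753668"},
    "753692": {"originalVersionId": ""},
}
grouped = group_revisions(page_objects)
print(grouped)
```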
| 362 | |||
| 363 | ==== ##com.atlassian.confluence.security.ContentPermission## ==== | ||
| 364 | |||
| 365 | {{code language="xml"}} | ||
| 366 | <object class="ContentPermission" package="com.atlassian.confluence.security"> | ||
| 367 | <id name="id">1048577</id> | ||
| 368 | <property name="type"><![CDATA[View]]></property> | ||
| 369 | <property name="userName"><![CDATA[admin]]></property> | ||
| 370 | <property name="groupName"/><property name="owningSet" class="ContentPermissionSet" package="com.atlassian.confluence.security"><id name="id">1015809</id> | ||
| 371 | </property> | ||
| 372 | <property name="creatorName"/><property name="creationDate">2013-10-14 15:41:26.893</property> | ||
| 373 | <property name="lastModifierName"><![CDATA[admin]]></property> | ||
| 374 | <property name="lastModificationDate">2013-10-14 15:41:26.893</property> | ||
| 375 | </object> | ||
| 376 | {{/code}} | ||
| 377 | |||
| 378 | Other properties: | ||
| 379 | |||
| 380 | * ##type##: the name of the permission. Note: the content permission set owning this content permission also has a type property, with the same value. | ||
| 381 | * ##owningSet##: an element-like value pointing to the content permission set which contains this content permission | ||
| 382 | * ##userName##: the name of the user to which the permission applies. | ||
| 383 | * ##groupName##: the name of the group to which the permission applies. | ||
| 384 | |||
| 385 | ##ContentPermission## objects also have the usual creation and modification properties. | ||
| 386 | |||
| 387 | ==== ##com.atlassian.confluence.security.ContentPermissionSet## ==== | ||
| 388 | |||
| 389 | {{code language="xml"}} | ||
| 390 | <object class="ContentPermissionSet" package="com.atlassian.confluence.security"> | ||
| 391 | <id name="id">67043333</id> | ||
| 392 | <property name="type"><![CDATA[View]]></property> | ||
| 393 | <collection name="contentPermissions" class="java.util.SortedSet"> | ||
| 394 | <element class="ContentPermission" package="com.atlassian.confluence.security"> | ||
| 395 | <id name="id">67076114</id> | ||
| 396 | </element> | ||
| 397 | <!-- ... cut ... --> | ||
| 398 | <element class="ContentPermission" package="com.atlassian.confluence.security"> | ||
| 399 | <id name="id">152338667</id> | ||
| 400 | </element> | ||
| 401 | </collection> | ||
| 402 | <property name="owningContent" class="Page" package="com.atlassian.confluence.pages"> | ||
| 403 | <id name="id">66719934</id> | ||
| 404 | </property> | ||
| 405 | <property name="creationDate">2015-01-14 10:21:23.000</property> | ||
| 406 | <property name="lastModificationDate">2019-07-31 09:54:26.000</property> | ||
| 407 | </object> | ||
| 408 | {{/code}} | ||
| 409 | |||
| 410 | ##ContentPermissionSet## objects have the usual creation and modification properties. | ||
| 411 | |||
| 412 | Other properties: | ||
| 413 | |||
| 414 | * ##type##: the name of the permission of all the content permissions in this set. Note: The content permissions themselves also have a type property, with the same value. | ||
| 415 | * ##owningContent##: an element-like property pointing to the content to which the permissions of this content permission set apply | ||
| 416 | * ##contentPermissions##: a collection of content permissions | ||
| 417 | |||
| 418 | ==== ##com.atlassian.confluence.security.SpacePermission## ==== | ||
| 419 | |||
| 420 | {{code language="xml"}} | ||
| 421 | <object class="SpacePermission" package="com.atlassian.confluence.security"> | ||
| 422 | <id name="id">617742337</id> | ||
| 423 | <property name="space" class="Space" package="com.atlassian.confluence.spaces"> | ||
| 424 | <id name="id">622593</id> | ||
| 425 | </property> | ||
| 426 | <property name="type"><![CDATA[COMMENT]]></property> | ||
| 427 | <property name="group"/> | ||
| 428 | <property name="allUsersSubject"><![CDATA[anonymous-users]]></property> | ||
| 429 | <property name="creator" class="ConfluenceUserImpl" package="com.atlassian.confluence.user"> | ||
| 430 | <id name="key"><![CDATA[01f7c1ca483b2b1c01483b2d4f4206cc]]></id> | ||
| 431 | </property> | ||
| 432 | <property name="creationDate">2023-11-10 14:11:48.022</property> | ||
| 433 | <property name="lastModifier" class="ConfluenceUserImpl" package="com.atlassian.confluence.user"> | ||
| 434 | <id name="key"><![CDATA[01f7c1ca483b2b1c01483b2d4f4206cc]]></id> | ||
| 435 | </property> | ||
| 436 | <property name="lastModificationDate">2023-11-10 14:11:48.022</property> | ||
| 437 | </object> | ||
| 438 | |||
| 439 | {{/code}} | ||
| 440 | |||
| 441 | A space permission. | ||
| 442 | |||
| 443 | Other properties: | ||
| 444 | |||
| 445 | * type: the name of the permission | ||
| 446 | * space: the space which the space permission applies to | ||
| 447 | * group: the name of the group to which the permission applies. Empty if it doesn't apply to a group | ||
| 448 | * allUsersSubject: equal to anonymous-users if the permission applies to guests | ||
| 449 | * userSubject: an element-like value with a ##<id name="key">## containing the key of the user, described by a ConfluenceUserImpl object, to which the permission applies | ||
| 450 | |||
| 451 | SpacePermission objects have the usual creation and modification properties. | ||
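
To make the properties above concrete, here is a minimal Python sketch that reads a ##SpacePermission## object and reports what it applies to. The helper name ##describe_space_permission## and the keys of the returned dictionary are hypothetical, and the sample is a shortened copy of the example above. This is not how confluence-xml itself parses these objects.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the SpacePermission example shown above.
SAMPLE = """<object class="SpacePermission" package="com.atlassian.confluence.security">
<id name="id">617742337</id>
<property name="space" class="Space" package="com.atlassian.confluence.spaces">
<id name="id">622593</id>
</property>
<property name="type"><![CDATA[COMMENT]]></property>
<property name="group"/>
<property name="allUsersSubject"><![CDATA[anonymous-users]]></property>
</object>"""


def describe_space_permission(obj):
    """Summarize a SpacePermission <object> element as a dict (hypothetical helper)."""
    props = {p.get("name"): p for p in obj.findall("property")}

    def text(name):
        # Plain-value property: stripped text, or None if absent or empty.
        prop = props.get(name)
        value = (prop.text or "").strip() if prop is not None else ""
        return value or None

    def ref(name):
        # Element-like property: text of the nested <id>, or None if absent.
        prop = props.get(name)
        id_el = prop.find("id") if prop is not None else None
        return id_el.text.strip() if id_el is not None and id_el.text else None

    return {
        "type": text("type"),
        "space_id": ref("space"),
        "group": text("group"),
        "all_users_subject": text("allUsersSubject"),
        "user_key": ref("userSubject"),
    }


permission = describe_space_permission(ET.fromstring(SAMPLE))
print(permission)
```

In this sample, ##group## and ##userSubject## resolve to nothing and ##allUsersSubject## is set, so the permission applies to guests.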
| 452 | |||
| 453 | ==== ##com.atlassian.confluence.setup.bandana.ConfluenceBandanaRecord## ==== | ||
| 454 | |||
| 455 | {{code language="xml"}} | ||
| 456 | <object class="ConfluenceBandanaRecord" package="com.atlassian.confluence.setup.bandana"> | ||
| 457 | <id name="id">43</id> | ||
| 458 | <property name="context"><![CDATA[_GLOBAL]]></property> | ||
| 459 | <property name="key"><![CDATA[__DEFAULT_SPACE_PERMISSIONS____GROUP_NAMES__]]></property> | ||
| 460 | <property name="value"><![CDATA[<set> | ||
| 461 | <string>confluence-users</string> | ||
| 462 | </set>]]></property> | ||
| 463 | </object> | ||
| 464 | |||
| 465 | {{/code}} | ||
| 466 | |||
| 467 | We don't yet use these objects. | ||
| 468 | |||
| 469 | ==== ##com.atlassian.confluence.spaces.Space## ==== | ||
| 470 | |||
| 471 | {{code language="xml"}} | ||
| 472 | <object class="Space" package="com.atlassian.confluence.spaces"> | ||
| 473 | <id name="id">622593</id> | ||
| 474 | <property name="name"><![CDATA[Great Internal Documentation]]></property> | ||
| 475 | <property name="key"><![CDATA[Great]]></property> | ||
| 476 | <property name="lowerKey"><![CDATA[great]]></property> | ||
| 477 | <property name="description" class="SpaceDescription" package="com.atlassian.confluence.spaces"> | ||
| 478 | <id name="id">589825</id> | ||
| 479 | </property> | ||
| 480 | <property name="homePage" class="Page" package="com.atlassian.confluence.pages"> | ||
| 481 | <id name="id">589826</id> | ||
| 482 | </property> | ||
| 483 | <collection name="permissions" class="java.util.Collection"> | ||
| 484 | <element class="SpacePermission" package="com.atlassian.confluence.security"> | ||
| 485 | <id name="id">1277959</id> | ||
| 486 | </element> | ||
| 487 | <element class="SpacePermission" package="com.atlassian.confluence.security"> | ||
| 488 | <id name="id">1277960</id> | ||
| 489 | </element> | ||
| 490 | <!-- ... cut ... --> | ||
| 491 | <element class="SpacePermission" package="com.atlassian.confluence.security"> | ||
| 492 | <id name="id">617742337</id> | ||
| 493 | </element> | ||
| 494 | </collection> | ||
| 495 | <collection name="pageTemplates" class="java.util.Collection"> | ||
| 496 | <element class="PageTemplate" package="com.atlassian.confluence.pages.templates"> | ||
| 497 | <id name="id">110723073</id> | ||
| 498 | </element> | ||
| 499 | <element class="PageTemplate" package="com.atlassian.confluence.pages.templates"> | ||
| 500 | <id name="id">199458822</id> | ||
| 501 | </element> | ||
| 502 | </collection> | ||
| 503 | <property name="creator" class="ConfluenceUserImpl" package="com.atlassian.confluence.user"> | ||
| 504 | <id name="key"><![CDATA[01f7c1ca483b2b1c01483b2d4db002d4]]></id> | ||
| 505 | </property> | ||
| 506 | <property name="creationDate">2008-04-23 11:24:41.000</property> | ||
| 507 | <property name="lastModifier" class="ConfluenceUserImpl" package="com.atlassian.confluence.user"> | ||
| 508 | <id name="key"><![CDATA[01f7c1ca483b2b1c01483b2d4db002d4]]></id> | ||
| 509 | </property> | ||
| 510 | <property name="lastModificationDate">2009-07-10 10:16:53.000</property> | ||
| 511 | <property name="spaceType">global</property> | ||
| 512 | <property name="spaceStatus" enum-class="SpaceStatus" package="com.atlassian.confluence.spaces">CURRENT</property> | ||
| 513 | </object> | ||
| 514 | {{/code}} | ||
| 515 | |||
| 516 | A space. Its known properties are: | ||
| 517 | |||
| 518 | * name: the pretty name of the space | ||
| 519 | * key: the space key, supposed to be unique | ||
| 520 | * lowerKey: the lowercase version of the key, also supposed to be unique | ||
| 521 | * description: an element-like value pointing to the space description, which is usually shown in tables listing spaces. We currently drop this, except for the labels. We used to import the space description as the home page of the space; we now import the home page of the space itself. | ||
| 522 | * homePage: an element-like value pointing to the space home page. Note: a space may not have any home page, in which case all pages are orphans. [[confluence-xml will, by default, issue a minimal home page listing its children>>https://jira.xwiki.org/browse/CONFLUENCE-460]]. | ||
| 523 | * permissions: a collection of space permissions (SpacePermission objects) | ||
| 524 | * pageTemplates: a collection of page templates (PageTemplate objects) | ||
| 525 | * spaceType: whether the space is global or personal (See [[https:~~/~~/docs.atlassian.com/atlassian-confluence/1000.107.0/com/atlassian/confluence/spaces/SpaceType.html>>https://docs.atlassian.com/atlassian-confluence/1000.107.0/com/atlassian/confluence/spaces/SpaceType.html]]) | ||
| 526 | * spaceStatus: whether the space is current or archived (See [[https:~~/~~/docs.atlassian.com/atlassian-confluence/1000.107.0/com/atlassian/confluence/spaces/SpaceStatus.html>>https://docs.atlassian.com/atlassian-confluence/1000.107.0/com/atlassian/confluence/spaces/SpaceStatus.html]]) | ||
| 527 | |||
| 528 | Space objects have the usual creation and modification properties. | ||
| 529 | |||
| 530 | ==== ##com.atlassian.confluence.spaces.SpaceDescription## ==== | ||
| 531 | |||
| 532 | {{code language="xml"}} | ||
| 533 | <object class="SpaceDescription" package="com.atlassian.confluence.spaces"> | ||
| 534 | <id name="id">753665</id> | ||
| 535 | <property name="space" class="Space" package="com.atlassian.confluence.spaces"><id name="id">786433</id> | ||
| 536 | </property> | ||
| 537 | <property name="title"/><collection name="bodyContents" class="java.util.Collection"><element class="BodyContent" package="com.atlassian.confluence.core"><id name="id">819201</id> | ||
| 538 | </element> | ||
| 539 | </collection> | ||
| 540 | <property name="version">1</property> | ||
| 541 | <property name="creatorName"><![CDATA[admin]]></property> | ||
| 542 | <property name="creationDate">2013-10-14 14:53:25.489</property> | ||
| 543 | <property name="lastModifierName"><![CDATA[admin]]></property> | ||
| 544 | <property name="lastModificationDate">2013-10-14 14:53:25.489</property> | ||
| 545 | <property name="versionComment"><![CDATA[]]></property> | ||
| 546 | <property name="contentStatus"><![CDATA[current]]></property> | ||
| 547 | <collection name="labellings" class="java.util.Collection"><element class="Labelling" package="com.atlassian.confluence.labels"><id name="id">720901</id> | ||
| 548 | </element> | ||
| 549 | </collection> | ||
| 550 | </object> | ||
| 551 | {{/code}} | ||
| 552 | |||
| 553 | The description of a space. For a description of its properties, see Page (the two types of objects are very similar). SpaceDescription objects have the usual creation and modification properties. | ||
| 554 | |||
| 555 | ==== ##com.atlassian.confluence.user.ConfluenceUserImpl## ==== | ||
| 556 | |||
| 557 | {{code language="xml"}} | ||
| 558 | <object class="ConfluenceUserImpl" package="com.atlassian.confluence.user"> | ||
| 559 | <id name="key"><![CDATA[01f7c1cc638e0d8c0163d05ca6f60124]]></id> | ||
| 560 | <property name="name"><![CDATA[47826731]]></property> | ||
| 561 | <property name="lowerName"><![CDATA[47826731]]></property> | ||
| 562 | <property name="email"/> | ||
| 563 | </object> | ||
| 564 | {{/code}} | ||
| 565 | |||
| 566 | An object that represents a user. | ||
| 567 | |||
| 568 | {{info}} | ||
| 569 | The id is not a number but the user key, and the ##name## attribute of the ##id## tag is "key". | ||
| 570 | {{/info}} | ||
| 571 | |||
| 572 | * ##name##: the name of the user | ||
| 573 | * ##lowerName##: the lowercase version of the name of the user | ||
| 574 | * ##email##: the email address of the user (optional) | ||
| 575 | |||
| 576 | ==== ##com.atlassian.confluence.user.persistence.dao.ConfluenceRememberMeToken## ==== | ||
| 577 | |||
| 578 | {{code language="xml"}} | ||
| 579 | <object class="ConfluenceRememberMeToken" package="com.atlassian.confluence.user.persistence.dao"> | ||
| 580 | <id name="id">393217</id> | ||
| 581 | <property name="username"><![CDATA[admin]]></property> | ||
| 582 | <property name="createdTime">1381745929067</property> | ||
| 583 | <property name="token"><![CDATA[251b5b4649888218a9c81ddf30b66029b63f83d5]]></property> | ||
| 584 | </object> | ||
| 585 | {{/code}} | ||
| 586 | |||
| 587 | We don't use these objects. | ||
| 588 | |||
| 589 | ==== ##com.atlassian.confluence.user.PersonalInformation## ==== | ||
| 590 | |||
| 591 | {{code language="xml"}} | ||
| 592 | <object class="PersonalInformation" package="com.atlassian.confluence.user"> | ||
| 593 | <id name="id">753694</id> | ||
| 594 | <property name="username"><![CDATA[user1]]></property> | ||
| 595 | <property name="title"/><property name="version">1</property> | ||
| 596 | <property name="creatorName"><![CDATA[admin]]></property> | ||
| 597 | <property name="creationDate">2013-10-14 15:42:39.535</property> | ||
| 598 | <property name="lastModifierName"><![CDATA[admin]]></property> | ||
| 599 | <property name="lastModificationDate">2013-10-14 15:42:39.535</property> | ||
| 600 | <property name="versionComment"><![CDATA[]]></property> | ||
| 601 | <property name="contentStatus"><![CDATA[current]]></property> | ||
| 602 | </object> | ||
| 603 | |||
| 604 | {{/code}} | ||
| 605 | |||
| 606 | We don't yet use these objects (which seem to have the usual creation and modification properties). See [[https:~~/~~/docs.atlassian.com/atlassian-confluence/6.6.0/com/atlassian/confluence/user/PersonalInformation.html>>https://docs.atlassian.com/atlassian-confluence/6.6.0/com/atlassian/confluence/user/PersonalInformation.html]] | ||
| 607 | |||
| 608 | ==== ##com.atlassian.crowd.embedded.hibernate2.HibernateMembership## ==== | ||
| 609 | |||
| 610 | {{code language="xml"}} | ||
| 611 | <object class="HibernateMembership" package="com.atlassian.crowd.embedded.hibernate2"> | ||
| 612 | <id name="id">294915</id> | ||
| 613 | <property name="parentGroup" class="InternalGroup" package="com.atlassian.crowd.model.group"><id name="id">163842</id> | ||
| 614 | </property> | ||
| 615 | <property name="userMember" class="InternalUser" package="com.atlassian.crowd.model.user"><id name="id">229378</id> | ||
| 616 | </property> | ||
| 617 | </object> | ||
| 618 | |||
| 619 | {{/code}} | ||
| 620 | |||
| 621 | We don't yet use these objects. | ||
| 622 | |||
| 623 | ==== ##com.atlassian.crowd.model.application.DirectoryMapping## ==== | ||
| 624 | |||
| 625 | {{code language="xml"}} | ||
| 626 | <object class="DirectoryMapping" package="com.atlassian.crowd.model.application"> | ||
| 627 | <id name="id">131073</id> | ||
| 628 | <property name="application" class="ApplicationImpl" package="com.atlassian.crowd.model.application"><id name="id">65537</id> | ||
| 629 | </property> | ||
| 630 | <property name="directory" class="DirectoryImpl" package="com.atlassian.crowd.model.directory"><id name="id">98305</id> | ||
| 631 | </property> | ||
| 632 | <property name="allowAllToAuthenticate">true</property> | ||
| 633 | <collection name="allowedOperations" class="java.util.Set"><element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">UPDATE_ROLE_ATTRIBUTE</element> | ||
| 634 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">DELETE_USER</element> | ||
| 635 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">UPDATE_USER_ATTRIBUTE</element> | ||
| 636 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">CREATE_USER</element> | ||
| 637 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">DELETE_ROLE</element> | ||
| 638 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">CREATE_ROLE</element> | ||
| 639 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">CREATE_GROUP</element> | ||
| 640 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">UPDATE_USER</element> | ||
| 641 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">UPDATE_GROUP_ATTRIBUTE</element> | ||
| 642 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">UPDATE_GROUP</element> | ||
| 643 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">DELETE_GROUP</element> | ||
| 644 | <element enum-class="OperationType" package="com.atlassian.crowd.embedded.api">UPDATE_ROLE</element> | ||
| 645 | </collection> | ||
| 646 | </object> | ||
| 647 | |||
| 648 | {{/code}} | ||
| 649 | |||
| 650 | We don't use these objects. | ||
| 651 | |||
| 652 | ==== ##com.atlassian.crowd.model.user.InternalUser## ==== | ||
| 653 | |||
| 654 | {{code language="xml"}} | ||
| 655 | <object class="InternalUser" package="com.atlassian.crowd.model.user"> | ||
| 656 | <id name="id">163842</id> | ||
| 657 | <property name="name"><![CDATA[UserName]]></property> | ||
| 658 | <property name="lowerName"><![CDATA[username]]></property> | ||
| 659 | <property name="active">true</property> | ||
| 660 | <property name="createdDate">2016-05-10 15:00:02.760</property> | ||
| 661 | <property name="updatedDate">2016-05-10 15:00:02.760</property> | ||
| 662 | <property name="firstName"><![CDATA[User]]></property> | ||
| 663 | <property name="lowerFirstName"><![CDATA[user]]></property> | ||
| 664 | <property name="lastName"><![CDATA[Name]]></property> | ||
| 665 | <property name="lowerLastName"><![CDATA[name]]></property> | ||
| 666 | <property name="displayName"><![CDATA[User Name]]></property> | ||
| 667 | <property name="lowerDisplayName"><![CDATA[user name]]></property> | ||
| 668 | <property name="emailAddress"><![CDATA[[email protected]]]></property> | ||
| 669 | <property name="lowerEmailAddress"><![CDATA[[email protected]]]></property> | ||
| 670 | </object> | ||
| 671 | {{/code}} | ||
| 672 | |||
| 673 | These objects seem to appear only in site backups, not in space exports. They describe users that are supposedly registered directly in Confluence. | ||
| 674 | |||
| 675 | ==== ##com.atlassian.crowd.model.group.InternalGroup## ==== | ||
| 676 | |||
| 677 | {{code language="xml"}} | ||
| 678 | <object class="InternalGroup" package="com.atlassian.crowd.model.group"> | ||
| 679 | <id name="id">163843</id> | ||
| 680 | <property name="name"><![CDATA[twistedgroup]]></property> | ||
| 681 | <property name="lowerName"><![CDATA[twistedgroup]]></property> | ||
| 682 | <property name="active">true</property> | ||
| 683 | <property name="local">false</property> | ||
| 684 | <property name="createdDate">2013-10-14 15:43:47.360</property> | ||
| 685 | <property name="updatedDate">2013-10-14 15:43:47.360</property> | ||
| 686 | <property name="description"/><property name="type" enum-class="GroupType" package="com.atlassian.crowd.model.group">GROUP</property> | ||
| 687 | <property name="directory" class="DirectoryImpl" package="com.atlassian.crowd.model.directory"><id name="id">98305</id> | ||
| 688 | </property> | ||
| 689 | </object> | ||
| 690 | {{/code}} | ||
| 691 | |||
| 692 | These objects seem to appear only in site backups, not in space exports. They describe a group of users. They are used when group imports are enabled. It's usually better to import groups from a central user directory like LDAP. | ||
| 693 | |||
| 694 | ==== ##com.atlassian.crowd.model.user.InternalUserAttribute## ==== | ||
| 695 | |||
| 696 | {{code language="xml"}} | ||
| 697 | <object class="InternalUserAttribute" package="com.atlassian.crowd.model.user"> | ||
| 698 | <id name="id">262152</id> | ||
| 699 | <property name="user" class="InternalUser" package="com.atlassian.crowd.model.user"><id name="id">229379</id> | ||
| 700 | </property> | ||
| 701 | <property name="directory" class="DirectoryImpl" package="com.atlassian.crowd.model.directory"><id name="id">98305</id> | ||
| 702 | </property> | ||
| 703 | <property name="name"><![CDATA[passwordLastChanged]]></property> | ||
| 704 | <property name="value"><![CDATA[1381758208148]]></property> | ||
| 705 | <property name="lowerValue"><![CDATA[1381758208148]]></property> | ||
| 706 | </object> | ||
| 707 | {{/code}} | ||
| 708 | |||
| 709 | We don't yet use these objects. | ||
| 710 | |||
| 711 | == attachments folder == | ||
| 712 | |||
| 713 | This folder is present in space exports unless the attachment export was disabled. To our knowledge, attachments are not present in site exports. When the folder is present, version ##v## (the ##version## or ##attachmentVersion## property) of the attachment with original id ##originalAttachmentId##, attached to the content with original id ##originalContentId##, is expected to be at ##attachments/<originalContentId>/<originalAttachmentId>/<v>##. | ||
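
As an illustration of this layout, the following Python sketch builds the expected zip entry name for an attachment version and checks whether a package contains it. The helper names ##attachment_entry## and ##has_attachment## are hypothetical, and so are the ids. The package is a tiny in-memory zip created for the example, not a real export.

```python
import io
import zipfile


def attachment_entry(original_content_id, original_attachment_id, version):
    """Expected zip entry of an attachment version, following the layout described above."""
    return f"attachments/{original_content_id}/{original_attachment_id}/{version}"


def has_attachment(package, original_content_id, original_attachment_id, version):
    """Tell whether the opened zip package contains the given attachment version."""
    entry = attachment_entry(original_content_id, original_attachment_id, version)
    return entry in package.namelist()


# Build a tiny in-memory package holding one attachment version (made-up ids).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr(attachment_entry(1001, 2002, 1), b"dummy attachment data")

with zipfile.ZipFile(buffer) as package:
    found_v1 = has_attachment(package, 1001, 2002, 1)
    found_v2 = has_attachment(package, 1001, 2002, 2)

print(found_v1, found_v2)
```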
| 714 | |||
| 715 | == The export descriptor file (##exportDescriptor.properties##) == | ||
| 716 | |||
| 717 | The export descriptor file gives information about the export. It is a properties file, each line containing a key=value pair. | ||
| 718 | It starts with a comment containing the date at which the export was done. | ||
| 719 | |||
| 720 | {{code language="none"}} | ||
| 721 | #Mon Feb 10 10:23:09 UTC 2025 | ||
| 722 | ao.data.version.min.com.atlassian.mywork.mywork-confluence-host-plugin=1.1.30 | ||
| 723 | ao.data.version.com.atlassian.mywork.mywork-confluence-host-plugin=1000.0.0-fa970f983392 | ||
| 724 | createdByVersionNumber=1000.0.0-fa970f983392 | ||
| 725 | source=cloud | ||
| 726 | buildNumber=4515 | ||
| 727 | ao.data.list=com.atlassian.mywork.mywork-confluence-host-plugin, com.atlassian.confluence.plugins.confluence-space-ia | ||
| 728 | spaceKey=attachhist | ||
| 729 | ao.data.version.min.com.atlassian.confluence.plugins.confluence-space-ia=5.0 | ||
| 730 | defaultUsersGroup=confluence-users | ||
| 731 | ao.data.version.com.atlassian.confluence.plugins.confluence-space-ia=1000.0.0-fa970f983392 | ||
| 732 | exportType=space | ||
| 733 | createdByBuildNumber=8401 | ||
| 734 | timezoneId=UTC | ||
| 735 | inlineTasksFileIncluded=true | ||
| 736 | backupAttachments=true | ||
| 737 | {{/code}} | ||
| 738 | |||
| 739 | Here are some interesting properties: | ||
| 740 | |||
| 741 | * ##source##: its value can be ##server## or ##cloud##. If it is absent, ##server## can be assumed. It is useful because Confluence Cloud and Confluence Server differ significantly. | ||
| 742 | * ##exportType##: its value is ##all## for a site backup, and ##space## for a space backup. | ||
| 743 | * ##spaceKey##: its value is the key of the exported space for space backups. Note: this can be used to handle [[https://jira.atlassian.com/browse/CONFSERVER-22853]], a bug in Confluence Server that makes it export several spaces when only one was requested (see [[https://jira.xwiki.org/browse/CONFLUENCE-296]]). | ||
| 744 | * ##backupAttachments##: whether exporting attachments was enabled for this backup. | ||
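
The properties listed above can be read with a few lines of Python. This is a minimal sketch: the function names ##parse_export_descriptor## and ##summarize_export## are hypothetical, and the parser deliberately ignores the finer points of the Java properties format (escapes, line continuations, ##:## separators), which the sample descriptor shown earlier does not use.

```python
def parse_export_descriptor(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def summarize_export(props):
    """Extract the properties discussed above (hypothetical result keys)."""
    return {
        # When "source" is absent, "server" can be assumed.
        "source": props.get("source", "server"),
        "export_type": props.get("exportType"),
        "space_key": props.get("spaceKey"),
        "backup_attachments": props.get("backupAttachments") == "true",
    }


SAMPLE = """#Mon Feb 10 10:23:09 UTC 2025
source=cloud
spaceKey=attachhist
exportType=space
backupAttachments=true
"""

summary = summarize_export(parse_export_descriptor(SAMPLE))
print(summary)
```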
| 745 | |||
| 746 | == Quirks and how to handle them == | ||
| 747 | |||
| 748 | |||
| 749 | |||
| 750 | === The entities.xml can have a leading space in its name === | ||
| 751 | |||
| 752 | We've seen this sporadically. This makes the import fail fast. You'll need to remove the leading space from the filename. Should this happen more often, we shall add a workaround in Confluence XML. | ||
| 753 | |||
| 754 | === Space exports sometimes contain several spaces === | ||
| 755 | |||
| 756 | A bug in Confluence Server sometimes makes it export several spaces when only one was requested ([[https://jira.atlassian.com/browse/CONFSERVER-22853]]). | ||
| 757 | |||
| 758 | We work around this issue in confluence-xml by not importing the extraneous space by default ([[https://jira.xwiki.org/browse/CONFLUENCE-296]]). | ||
| 759 | |||
| 760 | === Parsing entities.xml may not work out of the box because of the presence of control characters === | ||
| 761 | |||
| 762 | We fully work around this issue by transparently stripping these control characters at parse time. See: | ||
| 763 | |||
| 764 | * https://jira.xwiki.org/browse/CONFLUENCE-143 (BS characters, where BS means backspace) | ||
| 765 | * https://jira.xwiki.org/browse/CONFLUENCE-181 (character U+0002) | ||
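
For tools that parse ##entities.xml## themselves, a similar cleanup can be sketched in Python. This is not the confluence-xml implementation: it is a minimal example, with the hypothetical helper name ##strip_illegal_xml_chars##, that removes every character outside the ##Char## production of the XML 1.0 specification before the text is handed to a parser.

```python
import re

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text):
    """Remove the characters that an XML 1.0 parser would reject."""
    return _ILLEGAL_XML_CHARS.sub("", text)


# Backspace (U+0008) and U+0002 are removed; tab and newline are kept.
cleaned = strip_illegal_xml_chars("a\u0008b\u0002c\td\n")
print(repr(cleaned))
```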
| 766 | |||
| 767 | === The export can be corrupted so that entities.xml contains illegal XML characters in body contents === | ||
| 768 | |||
| 769 | XML parsers reject characters that are not allowed in XML 1.0, such as NULL or U+FFFF. With confluence-xml, you'll get a stack trace like this one: | ||
| 770 | |||
| 771 | {{code language="none"}} | ||
| 772 | 3/8/2025 5:56:26 PM Failed to read package | ||
| 773 | org.xwiki.filter.FilterException: Failed to analyze the package index | ||
| 774 | at org.xwiki.contrib.confluence.filter.input.ConfluenceXMLPackage.read(ConfluenceXMLPackage.java:893) | ||
| 775 | at org.xwiki.contrib.confluence.filter.internal.input.ConfluenceInputFilterStream.preparePackage(ConfluenceInputFilterStream.java:391) | ||
| 776 | at org.xwiki.contrib.confluence.filter.internal.input.ConfluenceInputFilterStream.readInternal(ConfluenceInputFilterStream.java:331) | ||
| 777 | at org.xwiki.contrib.confluence.filter.internal.input.ConfluenceInputFilterStream.read(ConfluenceInputFilterStream.java:229) | ||
| 778 | at org.xwiki.contrib.confluence.filter.internal.input.ConfluenceInputFilterStream.read(ConfluenceInputFilterStream.java:106) | ||
| 779 | at org.xwiki.filter.input.AbstractBeanInputFilterStream.read(AbstractBeanInputFilterStream.java:79) | ||
| 780 | at org.xwiki.filter.internal.job.FilterStreamConverterJob.runInternal(FilterStreamConverterJob.java:97) | ||
| 781 | at org.xwiki.job.AbstractJob.runInContext(AbstractJob.java:246) | ||
| 782 | at org.xwiki.job.AbstractJob.run(AbstractJob.java:223) | ||
| 783 | at org.xwiki.filter.script.internal.ScriptFilterStreamConverterJob.run(ScriptFilterStreamConverterJob.java:75) | ||
| 784 | at com.xwiki.confluencepro.internal.ConfluenceMigrationJob.runInternal(ConfluenceMigrationJob.java:159) | ||
| 785 | at org.xwiki.job.AbstractJob.runInContext(AbstractJob.java:246) | ||
| 786 | at org.xwiki.job.AbstractJob.run(AbstractJob.java:223) | ||
| 787 | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) | ||
| 788 | at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) | ||
| 789 | at java.base/java.lang.Thread.run(Thread.java:840) | ||
| 790 | 3/8/2025 5:56:26 PM Exception thrown during job execution | ||
| 791 | org.xwiki.filter.FilterException: Failed to read package | ||
| 792 | at org.xwiki.contrib.confluence.filter.internal.input.ConfluenceInputFilterStream.preparePackage(ConfluenceInputFilterStream.java:400) | ||
| 793 | at org.xwiki.contrib.confluence.filter.internal.input.ConfluenceInputFilterStream.readInternal(ConfluenceInputFilterStream.java:331) | ||
| 794 | at org.xwiki.contrib.confluence.filter.internal.input.ConfluenceInputFilterStream.read(ConfluenceInputFilterStream.java:229) | ||
| 795 | at org.xwiki.contrib.confluence.filter.internal.input.ConfluenceInputFilterStream.read(ConfluenceInputFilterStream.java:106) | ||
| 796 | at org.xwiki.filter.input.AbstractBeanInputFilterStream.read(AbstractBeanInputFilterStream.java:79) | ||
| 797 | at org.xwiki.filter.internal.job.FilterStreamConverterJob.runInternal(FilterStreamConverterJob.java:97) | ||
| 798 | at org.xwiki.job.AbstractJob.runInContext(AbstractJob.java:246) | ||
| 799 | at org.xwiki.job.AbstractJob.run(AbstractJob.java:223) | ||
| 800 | at org.xwiki.filter.script.internal.ScriptFilterStreamConverterJob.run(ScriptFilterStreamConverterJob.java:75) | ||
| 801 | at com.xwiki.confluencepro.internal.ConfluenceMigrationJob.runInternal(ConfluenceMigrationJob.java:159) | ||
| 802 | at org.xwiki.job.AbstractJob.runInContext(AbstractJob.java:246) | ||
| 803 | at org.xwiki.job.AbstractJob.run(AbstractJob.java:223) | ||
| 804 | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) | ||
| 805 | at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) | ||
| 806 | at java.base/java.lang.Thread.run(Thread.java:840) | ||
| 807 | Caused by: org.xwiki.filter.FilterException: Failed to analyze the package index | ||
| 808 | at org.xwiki.contrib.confluence.filter.input.ConfluenceXMLPackage.read(ConfluenceXMLPackage.java:893) | ||
| 809 | at org.xwiki.contrib.confluence.filter.internal.input.ConfluenceInputFilterStream.preparePackage(ConfluenceInputFilterStream.java:391) | ||
| 810 | ... 14 more | ||
| 811 | {{/code}} | ||
| 812 | |||
| 813 | You will need to identify the problematic content and the problematic characters. Then, we know of two main ways to deal with this issue: | ||
| 814 | |||
| 815 | 1. Remove the bad characters from the pages in Confluence and then re-export **without history** (exporting with history won't help, since the older versions of the pages still contain the problematic characters). | ||
| 816 | 1. Remove the bad characters by editing ##entities.xml##. | ||
| 817 | |||
| 818 | One way of identifying the problematic characters is to run a failing import with a debugger attached to the XWiki instance, set breakpoints in ConfluenceXMLPackage at the locations given by the stack trace, and identify the affected content(s). Alternatively, work with a modified version of ConfluenceXMLPackage that logs the ids of the objects it parses, and see where it stops. Then, extract the relevant body content and inspect it in an editor that shows special characters, or in a hexadecimal viewer / editor. | ||
| 819 | |||
| 820 | You can try to use Unix tools to search for occurrences of the illegal characters you identified. For example, let's say we identified this sequence of problematic bytes (the UTF-8 encoding of U+FFFF): ##\xEF\xBF\xBF## | ||
| 821 | |||
| 822 | You can count the occurrences using the following command: | ||
| 823 | |||
| 824 | {{code language="bash"}} | ||
| 825 | perl -nle '$c+=scalar(()=m/\xEF\xBF\xBF/g);END{print $c}' entities.xml | ||
| 826 | {{/code}} | ||
| 827 | |||
| 828 | or: | ||
| 829 | |||
| 830 | {{code language="bash"}} | ||
| 831 | unzip -p myexport.xml.zip entities.xml | perl -nle '$c+=scalar(()=m/\xEF\xBF\xBF/g);END{print $c}' | ||
| 832 | {{/code}} | ||
| 833 | |||
| 834 | You can remove these characters in place using sed (make sure you keep a copy of the original entities.xml file): | ||
| 835 | |||
| 836 | {{code language="bash"}} | ||
| 837 | sed -i 's/\xEF\xBF\xBF//g' entities.xml | ||
| 838 | {{/code}} | ||
| 839 | |||
| 840 | and then add back entities.xml in a copy of the export zip archive. | ||
| 841 | |||
| 842 | Another example, with grep, where we had some content ending with 34004 NULL (\0) characters (!): | ||
| 843 | |||
| 844 | {{code language="bash"}} | ||
| 845 | grep --only-matching -a -P '\x00' entities.xml | wc -l | ||
| 846 | # answer: 34004 | ||
| 847 | sed -i 's/\x00//g' entities.xml | ||
| 848 | {{/code}} | ||
| 849 | |||
| 850 | If you work on separate files instead of in place, you can compare the file sizes before and after fixing: | ||
| 851 | |||
| 852 | {{code language="none"}} | ||
| 853 | ls -la original/entities.xml fixed/entities.xml | ||
| 854 | -rw-r--r-- 1 raph raph 530607291 11 mars 09:02 fixed/entities.xml | ||
| 855 | -rw-r--r-- 1 raph raph 530641295 11 mars 08:51 original/entities.xml | ||
| 856 | {{/code}} | ||
| 857 | |||
| 858 | We have 530641295 - 530607291 = 34004, which matches the number of NULL characters that were removed. You can also check that only the problematic line(s) were modified by running diff: | ||
| 859 | |||
| 860 | {{code language="none"}} | ||
| 861 | diff original/entities.xml fixed/entities.xml | ||
| 862 | {{/code}} | ||
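
If the Unix tools are not available, the same count, removal and size check can be sketched in Python. The helper name ##count_and_remove## is hypothetical and the data below is a made-up stand-in for the content of ##entities.xml##. Note that this simple version holds the whole content in memory, which may be impractical for very large exports.

```python
def count_and_remove(data, bad):
    """Count the occurrences of a byte sequence and return the data without them."""
    count = data.count(bad)
    return count, data.replace(bad, b"")


# Made-up stand-in for the bytes of entities.xml: 5 NULL bytes and one U+FFFF.
original = b"<p>ok</p>" + b"\x00" * 5 + b"<p>\xef\xbf\xbfend</p>"

null_count, fixed = count_and_remove(original, b"\x00")
ffff_count, fixed = count_and_remove(fixed, b"\xef\xbf\xbf")

# The size difference must match what was removed, as in the ls check above.
removed_bytes = null_count * 1 + ffff_count * 3
print(null_count, ffff_count, len(original) - len(fixed) == removed_bytes)
```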
| 863 | |||
| 864 | {{info}} | ||
| 865 | For debugging and working with Confluence packages, see [[extensions:Extension.Confluence.XML.Exploring Confluence Exports.WebHome]]. | ||
| 866 | {{/info}} |