Identifiers used in LOAD RESPONSE_INFO BODY

The LOAD RESPONSE_INFO BODY command loads a character variable with all or part of the data from an HTTP response message body for a specified TCP connection. For a response body containing an HTML document, the WITH clause may be used to load a character variable with an element or part of an element from the document.

The WITH clause has the following format:

,WITH identifier

Note: identifier is a character variable, quoted character string, or character expression identifying the data to be retrieved from the HTML document in the response message body. The following sections describe the format of this identifier:

HTML Element Addressing

An element within an HTML document is identified by an element address string.

Format Definition:

tag(tagnum){/tag(tagnum)}:element-type:{attribute}(element-num)

tag

The HTML tag name.

tagnum

A number identifying the tag relative to its parent tag or the document root:

0 = First child tag
1 = Second child tag
n = nth child tag

element-type

The HTML element type. This must be one of the following:

ANONYMOUS ATTRIBUTE
ATTRIBUTE
COMMENT
SCRIPT
TEXT

attribute

For element-type ATTRIBUTE, specifies the name of the HTML attribute.

element-num

A number identifying the element. For element type ATTRIBUTE, the number identifies the attribute relative to its associated tag:

0 = First attribute
1 = Second attribute
n = nth attribute

Examples:

HTML(0)/BODY(1)/TABLE(1)/TBODY(0)/TR(0)/TD(0):TEXT:(0)
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)

Note: There must be no whitespace between any of the components of an identifier.

Note: Identifiers are not validated at compile time.
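
The address format above can be illustrated by splitting an address into its components. The following Python sketch is not part of SCL; the function name parse_element_address and the returned dictionary keys are hypothetical, chosen only to mirror the component names in the format definition.

```python
import re

# Hypothetical patterns mirroring the documented format:
#   tag(tagnum){/tag(tagnum)}:element-type:{attribute}(element-num)
_STEP = re.compile(r"([A-Za-z][A-Za-z0-9]*)\((\d+)\)")
_ADDRESS = re.compile(
    r"(?P<path>[^:]+):(?P<etype>[A-Z ]+):(?P<attr>[^()\[\]]*)\((?P<num>\d+)\)"
)


def parse_element_address(address):
    """Split an element address string into its documented components."""
    match = _ADDRESS.fullmatch(address)
    if match is None:
        raise ValueError("not an element address: " + address)
    steps = []
    for step in match.group("path").split("/"):
        step_match = _STEP.fullmatch(step)
        if step_match is None:
            raise ValueError("bad tag step: " + step)
        # Each step is (tag, tagnum); tagnum 0 is the first child tag.
        steps.append((step_match.group(1), int(step_match.group(2))))
    return {
        "path": steps,
        "element_type": match.group("etype"),
        # The attribute name is only present for element-type ATTRIBUTE.
        "attribute": match.group("attr") or None,
        "element_num": int(match.group("num")),
    }


parsed = parse_element_address("HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)")
print(parsed["path"], parsed["attribute"], parsed["element_num"])
```

Because identifiers are not validated at compile time, a check of this kind may help catch a malformed address before a script is run.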

Qualifying an HTML Element Address

An identifier containing only an HTML element address retrieves the complete element string from the HTML document. To retrieve only part of the element string, follow the HTML element address immediately with one of the qualifiers described below.

Selecting a Substring by Position and Length

An HTML element substring may be selected using an identifier specifying the offset of the substring and its length.

Format Definition:

element-addr[offset, length]

element-addr

The HTML element address in the format described above.

offset

The offset of the first character of the substring from the start of the element string.

length

The number of characters in the substring.

Note: If the offset is invalid, an empty string is returned.

Note: If the length is zero, or is invalid, all characters from the start offset to the end of the element string are returned.

Example:

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)[2,5]
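
The rules for offset and length can be modelled as follows. This Python sketch is not SCL; the function name select_by_position is hypothetical, and it assumes that an "invalid" offset is one lying outside the element string and that an "invalid" length is one that is negative or runs past the end of the string.

```python
def select_by_position(element, offset, length):
    """Model of element-addr[offset, length] on an element string."""
    # Assumption: an invalid offset is one outside the element string.
    if offset < 0 or offset >= len(element):
        return ""
    # Assumption: an invalid length is negative or runs past the end.
    # A zero or invalid length selects everything from the offset onwards.
    if length <= 0 or offset + length > len(element):
        return element[offset:]
    return element[offset:offset + length]


content = "text/html; charset=utf-8"
print(select_by_position(content, 2, 5))    # the [2,5] qualifier
print(select_by_position(content, 19, 0))   # zero length: rest of string
print(select_by_position(content, 100, 5))  # invalid offset: empty string
```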

Selecting a Substring using Delimiters

An HTML element substring may be selected by specifying an identifier containing two string delimiters. The substring returned runs from the first occurrence of the first delimiter to the first occurrence of the second delimiter, and includes both delimiter strings.

Format Definition:

element-addr[delimiter1, delimiter2]

element-addr

The HTML element address in the format described above.

delimiter1

A string - enclosed in single quotes - identifying the characters at the beginning of the substring.

delimiter2

A string - enclosed in single quotes - identifying the characters at the end of the substring.

Note: If delimiter1 cannot be found, an empty string is returned.

Note: If delimiter2 cannot be found, all characters from and including delimiter1 to the end of the element string are returned.

Example:

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)['document.cookie=',';']
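
The delimiter rules can be modelled as follows. This Python sketch is not SCL; the function name select_by_delimiters is hypothetical, and it assumes that the search for delimiter2 begins immediately after delimiter1, which the text above does not state explicitly.

```python
def select_by_delimiters(element, delimiter1, delimiter2):
    """Model of element-addr[delimiter1, delimiter2] on an element string."""
    start = element.find(delimiter1)
    if start == -1:
        # delimiter1 not found: empty string.
        return ""
    # Assumption: delimiter2 is searched for after the end of delimiter1.
    end = element.find(delimiter2, start + len(delimiter1))
    if end == -1:
        # delimiter2 not found: everything from delimiter1 to the end.
        return element[start:]
    # Both delimiter strings are included in the result.
    return element[start:end + len(delimiter2)]


script = "var x=1; document.cookie='id=42'; done"
print(select_by_delimiters(script, "document.cookie=", ";"))
```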

Selecting a Substring Using Position, Length and Delimiter String

The two methods of substring selection described above can be combined, allowing an HTML element substring to be identified either by a start string and a length, or by an offset and a termination string.

Format Definition:

element-addr[delimiter1, length]
   or
element-addr[offset, delimiter2]

element-addr

The HTML element address in the format described above.

delimiter1

A string - enclosed in single quotes - identifying the characters at the beginning of the substring.

length

The number of characters in the substring.

offset

The offset of the first character of the substring from the start of the element string.

delimiter2

A string - enclosed in single quotes - identifying the characters at the end of the substring.

Note: If delimiter1 cannot be found, an empty string is returned.

Note: If the offset is invalid, an empty string is returned.

Note: If delimiter2 cannot be found, all characters from the specified offset to the end of the element string are returned.

Note: If the length is zero, or is invalid, all characters from, and including, delimiter1 to the end of the element string are returned.

Examples:

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)['cookie=',3]
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)[2,';']

Excluding Delimiters from Selection

With the syntax described above, any delimiter strings specified are included in the returned substring. Either or both delimiters may be excluded from the returned substring by inverting the square bracket nearest to the delimiter, i.e. using an opening square bracket in place of a closing square bracket and vice versa.

This method can also be used with offset parameters. When the bracket nearest to the offset is inverted, the offset identifies the character immediately before the first character to be selected, rather than the first selected character itself.

The following examples illustrate how a substring may be selected from the CONTENT attribute string of an HTML META tag.

This example selects the substring that starts at offset 3 from the beginning of the attribute string and that is terminated by the next semicolon (included).

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]2,';']

This example selects the substring that starts at offset 2 from the beginning of the attribute string and that is terminated by the next semicolon (not included).

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)[2,';'[

This example selects the substring that starts at offset 3 from the beginning of the attribute string and that is terminated by the next semicolon (not included).

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]2,';'[
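
The effect of inverting either bracket can be modelled as follows. This Python sketch is not SCL; the function name select and its include_start and include_end flags are hypothetical stand-ins for the normal and inverted brackets. It assumes that a length counts from the first selected character and that the closing bracket has no effect when the second parameter is a length, since the text above does not describe those cases.

```python
def select(element, start, end, include_start=True, include_end=True):
    """Model of a qualifier whose brackets may be inverted.

    start: offset (int) or delimiter1 (str)
    end:   length (int) or delimiter2 (str)
    include_start=False models an inverted opening bracket, "]".
    include_end=False models an inverted closing bracket, "[".
    """
    if isinstance(start, int):
        # Inverted bracket: the offset is the character before the selection.
        begin = start if include_start else start + 1
        if start < 0 or begin >= len(element):
            return ""
        search_from = begin
    else:
        pos = element.find(start)
        if pos == -1:
            return ""
        begin = pos if include_start else pos + len(start)
        search_from = pos + len(start)

    if isinstance(end, int):
        # Assumption: the length counts from the first selected character.
        if end <= 0:
            return element[begin:]
        return element[begin:begin + end]

    pos = element.find(end, search_from)
    if pos == -1:
        return element[begin:]
    stop = pos + len(end) if include_end else pos
    return element[begin:stop]


content = "ab=cd;ef"
print(select(content, 2, ";", include_start=False))                     # ]2,';']
print(select(content, 2, ";", include_end=False))                       # [2,';'[
print(select(content, 2, ";", include_start=False, include_end=False))  # ]2,';'[
```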

Ignoring the Characters at the Beginning of an HTML Element

There are occasions when it is useful to apply the above facilities starting from some point within the element string, rather than from its beginning. This is achieved by resetting the selection base, either by specifying an offset from the beginning of the element string, or by specifying a substring that locates the start of the portion to be examined. The offset or substring is preceded by one of two operators, > or >=:

Format	Meaning
>offset	The offset is that of the character immediately before the substring to be examined.
>substring	The substring identifies the last characters of the portion to be ignored. The substring to be examined starts with the first character after it.
>=offset	The offset is that of the first character in the substring to be examined.
>=substring	The substring identifies the characters at the beginning of the substring to be examined.

Note: If the offset or substring cannot be found, an empty string is returned.

The following examples illustrate how the selection base is reset for a selection from the CONTENT attribute string of an HTML META tag.

In this example the selection base offset is set to the offset of the first character after the first occurrence of the string // Cookie in the element string. The selected substring starts with the character after document.cookie= and ends with the next semicolon (included).

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]>'// Cookie','document.cookie=',';']

Same as above, except that the selection base offset is now the offset of the first character of // Cookie.

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]>='// Cookie','document.cookie=',';']

Same as above, except that the selection base offset is now 50 characters from the start of the element string.

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]>=50,'document.cookie=',';']
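
The > and >= operators can be modelled as a step that trims the element string before any further selection is applied. This Python sketch is not SCL; the function name reset_base and its inclusive flag (True for >=, False for >) are hypothetical, and it assumes that an offset outside the element string counts as not found.

```python
def reset_base(element, base, inclusive):
    """Return the part of the element string left after resetting the base.

    base: offset (int) or substring (str)
    inclusive: True models ">=", False models ">"
    """
    if isinstance(base, int):
        # Assumption: an offset outside the string counts as "not found".
        if base < 0 or base >= len(element):
            return ""
        # ">" names the character immediately before the new base.
        return element[base:] if inclusive else element[base + 1:]
    pos = element.find(base)
    if pos == -1:
        return ""
    # ">=" keeps the substring itself; ">" starts just after it.
    return element[pos:] if inclusive else element[pos + len(base):]


content = "x=1; // Cookie document.cookie=abc; y=2"
print(reset_base(content, "// Cookie", inclusive=False))
print(reset_base(content, "// Cookie", inclusive=True))
```

Any delimiter or offset that follows in the identifier is then applied to the trimmed string rather than to the whole element string.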

Ignoring the Case of Characters

All string comparisons specified by LOAD RESPONSE_INFO BODY identifiers are case sensitive by default. To ignore the case of characters in a comparison, prefix the search string or delimiter string with the letter I.

In the example below the selection base is reset by searching the element string for // Cookie; the case of characters is ignored in the search.

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]>I'// Cookie',I'document.cookie=',';']
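
The effect of the I prefix on a search can be modelled as follows. This Python sketch is not SCL; the function name find_delimiter and its ignore_case flag are hypothetical, and it assumes ASCII text, for which lower-casing does not change the length of a string.

```python
def find_delimiter(element, delimiter, ignore_case=False):
    """Return the offset of delimiter in element, or -1 if it is absent."""
    if ignore_case:
        # Assumption: ASCII text, so lower-casing preserves offsets.
        return element.lower().find(delimiter.lower())
    return element.find(delimiter)


content = "x=1; // COOKIE document.cookie=abc;"
print(find_delimiter(content, "// Cookie"))                    # default search
print(find_delimiter(content, "// Cookie", ignore_case=True))  # I'// Cookie'
```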

Specifying Quotes Within Identifiers

Quoted character strings within SCL are delimited either by single quotes or by double quotes. Since the syntax of a LOAD RESPONSE_INFO BODY identifier includes single quotes, it is recommended that double quotes be used to delimit a quoted character string containing such an identifier.

A literal single quote character can be included within an identifier string by preceding it with a backslash. For example, this selects a substring terminated by a single quote:

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:XYZZY(1)[0,'\'']

A literal double quote character can be specified within an identifier string, using the SCL character command, ~<22>. For example, this selects a substring terminated by a double quote:

HTML(0)/HEAD(0)/META(1):ATTRIBUTE:XYZZY(1)[0,'~<22>']
