LOAD RESPONSE_INFO BODY Identifiers
The LOAD RESPONSE_INFO BODY command loads a character variable with all or part of the data from an HTTP response message body for a specified TCP connection. For a response body containing an HTML document, the "WITH" clause may be used to load a character variable with an element or part of an element from the document.
The "WITH" clause has the following format:
Note: identifier is a character variable, quoted character string or character expression identifying the data to be retrieved from the HTML document in the response message body. The following sections describe the format of this identifier.
HTML Element Addressing
An element within an HTML document is identified by an element address string.
Format:
tag(tagnum){/tag(tagnum)}:element_type:{attribute}(element_num)
Parameters:
tag
tagnum
A number identifying the tag relative to its parent tag or the document root.
0 = First child tag
1 = Second child tag
n = (n+1)th child tag
element_type
The HTML element type. This must be one of the following:
attribute
For element_type ATTRIBUTE, specifies the name of the HTML attribute.
element_num
A number identifying the element. For element type ATTRIBUTE, the number identifies the attribute relative to its associated tag.
Examples:
HTML(0)/BODY(1)/TABLE(1)/TBODY(0)/TR(0)/TD(0):TEXT:(0)
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)
- There must be no whitespace between any of the components of an identifier.
- Identifiers are not validated at compile time.
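The address format above can be illustrated with a short Python sketch that splits an address string into its components. This is not part of SCL or OpenSTA: the function name parse_element_address and the layout of the returned dictionary are hypothetical, and the character sets allowed for tag, element type and attribute names are assumptions made for illustration.

```python
import re

# Hypothetical pattern for tag(tagnum){/tag(tagnum)}:element_type:{attribute}(element_num).
# Assumption: tag names are alphanumeric and element types use letters or underscores.
ADDRESS_RE = re.compile(
    r"(?P<path>[A-Za-z0-9]+\(\d+\)(?:/[A-Za-z0-9]+\(\d+\))*)"
    r":(?P<etype>[A-Za-z_]+)"
    r":(?P<attr>[^()\s]*)"
    r"\((?P<num>\d+)\)"
)


def parse_element_address(address):
    """Split an element address into its tag path, element type, attribute and number."""
    match = ADDRESS_RE.fullmatch(address)
    if match is None:
        # Also rejects addresses containing whitespace between components.
        raise ValueError("not a valid element address: %r" % address)
    path = []
    for step in match.group("path").split("/"):
        tag, _, rest = step.partition("(")
        path.append((tag, int(rest.rstrip(")"))))
    return {
        "path": path,
        "element_type": match.group("etype"),
        "attribute": match.group("attr") or None,
        "element_num": int(match.group("num")),
    }


if __name__ == "__main__":
    print(parse_element_address("HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)"))
```

Because identifiers are not validated at compile time, a check of this kind could be run over identifier strings before a script is executed.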
Qualifying an HTML Element Address
A complete HTML element string may be retrieved from an HTML document using an identifier containing only an HTML element address. However, a substring may be selected from it using a variety of qualifiers. These qualifiers immediately follow the HTML element address and are described below.
Selecting a Substring by Position and Length
An HTML element substring may be selected using an identifier specifying the offset of the substring and its length.
Format:
element_addr[offset,length]
where "[" and "]" are literal characters and part of the required syntax.
Parameters:
element_addr
The HTML element address in the format described above.
offset
The offset of the first character of the substring from the start of the element string.
length
The number of characters in the substring.
- If the offset is invalid, an empty string is returned.
- If the length is zero, or is invalid, all characters from the start offset to the end of the element string are returned.
Example:
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)[2,5]
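The offset and length rules above can be modelled with a minimal Python sketch. It is not the OpenSTA implementation. The function name select_by_offset_length is hypothetical, offsets are assumed to be zero-based, and "invalid" is assumed to mean a negative value or one that runs past the end of the element string.

```python
def select_by_offset_length(element, offset, length):
    """Model of the [offset,length] qualifier under the assumptions stated above."""
    # Assumption: an invalid offset is negative or beyond the last character.
    if offset < 0 or offset >= len(element):
        return ""
    # A zero or invalid length selects everything from the offset onwards.
    if length <= 0 or offset + length > len(element):
        return element[offset:]
    return element[offset:offset + length]


if __name__ == "__main__":
    content = "text/html; charset=utf-8"
    print(select_by_offset_length(content, 2, 5))
```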
Selecting a Substring using Delimiters
An HTML element substring may be selected by specifying an identifier containing two string delimiters. The substring returned contains all the characters between the first occurrence of the first delimiter and the first occurrence of the second. The returned substring also includes both delimiter strings.
Format:
element_addr[delimiter1,delimiter2]
where "[" and "]" are literal characters and part of the required syntax.
Parameters:
element_addr
The HTML element address in the format described above.
delimiter1
A string - enclosed in single quotes - identifying the characters at the beginning of the substring.
delimiter2
A string - enclosed in single quotes - identifying the characters at the end of the substring.
- If delimiter1 cannot be found, an empty string is returned.
- If delimiter2 cannot be found, all characters from and including delimiter1 to the end of the element string are returned.
Example:
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)['document.cookie=',';']
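The delimiter rules above can be sketched in Python as follows. This is an illustrative model only. The function name select_by_delimiters is hypothetical, and the sketch assumes that delimiter2 is searched for after the end of delimiter1, which the text does not state explicitly.

```python
def select_by_delimiters(element, delimiter1, delimiter2):
    """Model of the [delimiter1,delimiter2] qualifier; both delimiters are included."""
    start = element.find(delimiter1)
    if start == -1:
        # delimiter1 not found: empty string.
        return ""
    # Assumption: the search for delimiter2 begins after delimiter1.
    end = element.find(delimiter2, start + len(delimiter1))
    if end == -1:
        # delimiter2 not found: everything from delimiter1 to the end.
        return element[start:]
    return element[start:end + len(delimiter2)]


if __name__ == "__main__":
    content = "x=1; document.cookie=abc; path=/"
    print(select_by_delimiters(content, "document.cookie=", ";"))
```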
Selecting a Substring Using Position, Length and Delimiter String
The above two methods of substring selection can be combined, allowing an HTML element substring to be identified by a start string and a length or an offset and a termination string.
Format:
element_addr[delimiter1,length]
element_addr[offset,delimiter2]
where "[" and "]" are literal characters and part of the required syntax.
Parameters:
element_addr
The HTML element address in the format described above.
delimiter1
A string - enclosed in single quotes - identifying the characters at the beginning of the substring.
length
The number of characters in the substring.
offset
The offset of the first character of the substring from the start of the element string.
delimiter2
A string - enclosed in single quotes - identifying the characters at the end of the substring.
- If delimiter1 cannot be found, an empty string is returned.
- If the offset is invalid, an empty string is returned.
- If delimiter2 cannot be found, all characters after, and including, delimiter1 to the end of the element string are returned.
- If the length is zero, or is invalid, all characters from the specified offset to the end of the element string are returned.
Examples:
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)['cookie=',3]
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)[2,';']
Excluding Delimiters from Selection
With the syntax described above, any delimiter strings specified are included in the returned substring. Either or both delimiters may be excluded from the returned substring by inverting the square bracket nearest to the delimiter, i.e. using an opening square bracket in place of a closing square bracket and vice versa.
This method can also be used with offset parameters. Instead of identifying the offset of the first character of the substring to be selected, using this alternative syntax, the offset becomes the offset of the character immediately before the first character to be selected.
The following examples illustrate how a substring may be selected from the CONTENT attribute string of an HTML META tag.
Examples:
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]2,';']
Selects the substring that starts at offset 3 from the beginning of the attribute string and that is terminated by - and includes - the next semicolon in the string.
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)[2,';'[
Selects the substring that starts at offset 2 from the beginning of the attribute string and that is terminated by - but does not include - the next semicolon in the string.
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]2,';'[
Selects the substring that starts at offset 3 from the beginning of the attribute string and that is terminated by - but does not include - the next semicolon in the string.
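The effect of inverting either bracket can be modelled with a small Python sketch. It is a reading of the rules in this section, not the OpenSTA implementation. The function name select and the include_start and include_end flags are hypothetical. An integer start stands for an offset, a string start for delimiter1, an integer end for a length and a string end for delimiter2. Offsets are assumed to be zero-based, and a length is assumed to count from the first selected character.

```python
def select(element, start, end, include_start=True, include_end=True):
    """Model of [start,end] with either bracket optionally inverted.

    include_start=False corresponds to a leading "]" and
    include_end=False to a trailing "[".
    """
    if isinstance(start, int):
        if start < 0 or start >= len(element):
            return ""
        # An excluded offset names the character just before the selection.
        begin = start if include_start else start + 1
        search_from = begin
    else:
        pos = element.find(start)
        if pos == -1:
            return ""
        begin = pos if include_start else pos + len(start)
        search_from = pos + len(start)
    if isinstance(end, int):
        # Zero or invalid length: take everything to the end of the string.
        if end <= 0 or begin + end > len(element):
            return element[begin:]
        return element[begin:begin + end]
    stop = element.find(end, search_from)
    if stop == -1:
        return element[begin:]
    return element[begin:stop + len(end)] if include_end else element[begin:stop]


if __name__ == "__main__":
    text = "0123;567;"
    print(select(text, 2, ";", include_start=False))                     # models ]2,';']
    print(select(text, 2, ";", include_end=False))                       # models [2,';'[
    print(select(text, 2, ";", include_start=False, include_end=False))  # models ]2,';'[
```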
Ignoring the Characters at the Beginning of an HTML Element
It is sometimes useful to apply the above facilities starting from some point within the element string rather than from its beginning. This is achieved by resetting the selection base, either to an offset from the beginning of the element string or to a substring that marks where the portion to be examined starts. The offset or substring is preceded by one of two operators, ">" or ">=":
>offset
The offset is that of the character immediately before the substring to be examined.
>substring
The substring identifies the characters at the end of the part of the element string to be ignored. The substring to be examined starts with the first character after it.
>=offset
The offset is that of the first character in the substring to be examined.
>=substring
The substring identifies the characters at the beginning of the substring to be examined.
If the offset or substring cannot be found, an empty string is returned.
The following examples illustrate how the selection base is reset for a selection from the CONTENT attribute string of an HTML META tag.
Examples:
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]>'// Cookie','document.cookie=',';']
The selection base offset is set to the offset of the first character after the first occurrence of the string "// Cookie" in the element string. The selected substring starts with the character after "document.cookie=" and ends with - and includes - the next semicolon.
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]>='// Cookie','document.cookie=',';']
Same as above, except that the selection base offset is now the first character of "// Cookie".
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]>=50,'document.cookie=',' ;']
Same as above, except that the selection base offset is now 50 characters from the start of the element string.
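The four ways of resetting the selection base can be summarised in a Python sketch. It is an illustrative model, not OpenSTA code. The function name reset_selection_base and its inclusive flag are hypothetical, with inclusive=True standing for the ">=" operator and inclusive=False for ">". The function returns None when the base cannot be found, in which case the overall selection would be an empty string.

```python
def reset_selection_base(element, base, inclusive=False):
    """Return the part of element to be examined, or None if base is not found.

    base is an int offset (assumed zero-based) or a search string.
    """
    if isinstance(base, int):
        if base < 0 or base >= len(element):
            return None
        # ">=offset" starts at the offset; ">offset" starts just after it.
        return element[base:] if inclusive else element[base + 1:]
    pos = element.find(base)
    if pos == -1:
        return None
    # ">=substring" keeps the substring; ">substring" skips past it.
    return element[pos:] if inclusive else element[pos + len(base):]


if __name__ == "__main__":
    content = "a=1; // Cookie document.cookie=abc; end"
    print(reset_selection_base(content, "// Cookie"))
    print(reset_selection_base(content, "// Cookie", inclusive=True))
```

The remaining qualifiers (offset, length and delimiters) would then be applied to the returned string rather than to the whole element string.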
Ignoring the Case of Characters
All string comparisons specified by LOAD RESPONSE_INFO BODY identifiers are by default case sensitive. The case of characters can be ignored in comparisons by prefixing the search string or delimiter string by "I".
Example:
HTML(0)/HEAD(0)/META(1):ATTRIBUTE:CONTENT(1)]>I'// Cookie',I'document.cookie=',';']
The selection base is reset by searching the element string for "// Cookie"; the case of characters is ignored in the search.
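The difference between the default comparison and one prefixed with "I" can be illustrated in Python. This sketch only models the described behaviour. The function name find_delimiter and its ignore_case flag are hypothetical and do not correspond to any SCL or OpenSTA function.

```python
import re


def find_delimiter(element, delimiter, ignore_case=False, start=0):
    """Return the index of delimiter in element at or after start, or -1."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes the delimiter match literally, not as a pattern.
    match = re.compile(re.escape(delimiter), flags).search(element, start)
    return match.start() if match else -1


if __name__ == "__main__":
    content = "x // COOKIE document.cookie=abc;"
    print(find_delimiter(content, "// Cookie"))                    # case sensitive
    print(find_delimiter(content, "// Cookie", ignore_case=True))  # models I'// Cookie'
```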
Specifying Quotes Within Identifiers
Quoted character strings within SCL are delimited by either single quotes or double quotes. Since the syntax of a LOAD RESPONSE_INFO BODY identifier includes single quotes, it is recommended that double quotes be used to delimit a quoted character string containing such an identifier.
A literal single quote character can be included within an identifier string by preceding it with a backslash. For example:
"HTML(0)/HEAD(0)/META(1):ATTRIBUTE:XYZZY(1)[0,'\'']"
This selects a substring terminated by a single quote.
A literal double quote character can be specified within an identifier string using the SCL character command, ~<22>. For example:
"HTML(0)/HEAD(0)/META(1):ATTRIBUTE:XYZZY(1)[0,'~<22>']"