WebRead

 

Syntax: WEBREAD(URL, Pattern, ErrorMessage, CodePage, DateTimeFormatSpecification, NumberFormatSpecification, AutoFormat)

 

Equivalent Function Names: WR

 

Purpose: The function downloads a website page from a given URL, parses it for a regular expression pattern and returns a range back. The results are cached using the standard Spearian's Market Data Cache.

 

Parameters: 
 

1.URL is complete URL of the webpage, including the protocol, e.g. http://
 

2.Pattern is the Regular expression pattern

Note: Pattern should not be longer than 255 characters.

If left blank, the default pattern of "(.*)" returning the complete html page is used.

Hint - use RegexBuddy or http://regexpal.com/ for testing of your expressions
 

3.ErrorMessage - if specified, it's returned instead of the default exception text
 

4.CodePage - the page used for parsing of the matched results to numbers and dates - the default value is "en-US". The list of possible values is at http://msdn.microsoft.com/en-us/goglobal/bb896001.aspx (check the column Culture Name)

In case you encounter any problems, see this article.
 

5.DateTimeFormatSpecification - if this field is present, the CodePage parameter is ignored for the conversions of Dates to Excel format. See the full list at http://msdn.microsoft.com/en-us/library/8kb3ddd4.aspx

Example: "Mon 23 Jun 8:45 AM 2013" will be recognized with this specification "ddd dd MMM h:mm tt yyyy"
 

6.NumberFormatSpecification  - if this field is present, the CodePage parameter is ignored for the conversions of Numbers to Excel format.

The parameter takes form of at least 4 characters:
1. Negative Sign character (1 character)
2. Currency Decimal Separator (1 character)
3. Currency Group Separator (1 character)
4. Currency Symbol (at least 1 character)

Example: "-.,$" is the standard setting for the US.

 

7.AutoFormat - an optional flag signifying if the result should be automatically converted to a range and if the result should be automatically formatted as Excel Data Types
 

 

Output: The matches range of the results. WARNING: The result cells are automatically converted to the right Excel type, e.g. date type.

 

NOTE: If you find any given formula valuable, it can be incorporated into the Spearian's Native (=super-fast range) formulas

 

Recommended procedure: If you see any useful statistics to put in to your spreadsheet, just open in the browser the website source and try to find the regular expression.

 

Examples:
1. Retrieve the top news headlines for a given stock

=WR("http://finance.yahoo.com/q/h?s=MSFT+Headlines","<div class=""mod yfi_quote_headline withsky""><ul class=""yfncnhl newsheadlines""></ul> <h3><span>([^<]*)</span></h3><ul><li><a href=""([^<]*)"">([^<]*)</a><cite>") returns the top headline for a given stock

 

 

2. Download beta for a given stock

=WR("http://finance.yahoo.com/q/ks?s=MSFT+Key+Statistics","Beta:</td><td class=""yfnc_tabledata1"">([^<]*)</td></tr>")

 

 

3. Check the contents of the page being processed

To check the html text file to be processed by regular expression, just use =WR("http://finance.yahoo.com/q/h?s=MSFT+Headlines","(.*)")

 

 

 

Copyright © 2013-2017 Jiri Pik

Document Version: Sunday, May 7, 2017