Sunday, July 21, 2019

Construction of Information Engine

Coiza: Making sense of information

1. Introduction

I more or less run my life directed by information from the Internet. I check the weather, check the traffic, look for places to go, look for reviews of those places, get updates from my friends and work, and browse lots of information from many other sources. The information I am interested in depends on many factors, including the time of day, my location, whether it is a work day or a weekend, and whether I am at home or on vacation. Indeed, many times the information I am interested in is prompted by information I have already discovered. For example, on a work day I might check the weather and, if it looks like rain, check the availability of trains to get me home (well, I wouldn't want to get wet!).

Technologies like Google Now do a great job of automating information assimilation by guessing what information is relevant to me. The challenge for this type of technology is that assimilation, particularly across many information sources, can be complex and not easy to guess. An alternative approach is to explicitly define the rules by which information is assimilated, in a way that can then be automatically processed by what I call Information Engines.

In this article I want to talk about an Information Engine that I have constructed called Coiza. Coiza is built around information channels that can be subscribed to and which display information as feeds like those used by Facebook and Twitter. Channels may display raw information, for example a news channel (like the BBC), or may display the result of combining information, for example location and Wikipedia summaries for that location. Channels can produce any information, including context information like location and time of day.
The most interesting feature of Coiza is that it allows new channels to be defined in terms of existing channels, together with rules for how the information from the existing channels is used to produce the information of the new channel.

2. Viewing Channels

Channels can be subscribed to within Coiza. Depending on the channel, it may be necessary to supply parameter values and/or authorisations for Coiza to access private information (e.g. Google Calendar) using the OAuth protocol. Once a channel has been subscribed to, information from that channel is displayed in a feed-like format, and the feed is hidden if there is no information to display.

3. Creating Channels

Viewing channels is where most users will spend the majority of their time, but the richness of the channels available to view is enabled by the ability to build new channels with relative ease. Any user within Coiza can create and publish channels by writing CoizaLang code. CoizaLang code consists of two primary constructs:

Info: A model of a piece of information that is either consumed or produced by a channel and can be rendered within feeds.

Channel: Consumes zero or more info flows, emits a single info flow, and defines rules for producing the output flow. Channels may be nested within each other.

3.1 Building Infos

Here is an example of a CoizaLang info for Message illustrating the key features of infos. Firstly, like all constructs, infos live within a namespace, or package, in this case coiza since it is supplied as part of the Coiza platform. All infos (and, for that matter, channels) must live in a namespace beginning with the username of the Coiza user that created them, which in my case is jwillans. Infos can subtype, or extend, other infos which, as we shall see a bit later, allows the same instance of a type to play different roles depending on the channel using it.
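The subtyping idea can be approximated outside CoizaLang. The following is a minimal Python sketch, not CoizaLang syntax: it assumes the hierarchy is Titled, then TitledContent, then Message (the exact chain is an assumption based on the type names used in this article), and the `render` method is a hypothetical stand-in for an info's optional render block.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Titled:
    """Base info: anything with a title (Python stand-in, not CoizaLang)."""
    title: str


@dataclass
class TitledContent(Titled):
    """Adds a content field to a titled info."""
    content: str


@dataclass
class Message(TitledContent):
    """Defines link and author locally; title and content are inherited."""
    link: str
    author: str

    def render(self) -> str:
        # Hypothetical stand-in for a render block: build HTML for a feed,
        # escaping every field so that markup in the data is shown literally.
        return (
            f'<a href="{escape(self.link)}">{escape(self.title)}</a>'
            f"<p>{escape(self.content)}</p>"
            f"<small>{escape(self.author)}</small>"
        )


msg = Message(
    title="Rain expected",
    content="Take an umbrella.",
    link="https://example.com/weather",
    author="jwillans",
)

# The same instance satisfies a port typed on Titled, TitledContent or Message.
print(isinstance(msg, Titled), isinstance(msg, TitledContent))
print(msg.render())
```

Because `msg` is an instance of all three types, a channel that only cares about titles could accept it without knowing anything about links or authors.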
In this example Message subtypes TitledContent and, in addition to having the link and author fields defined locally, inherits the title and content fields through the subtype relationship. Fields can be typed using primitive values or other info types.

A further important feature of infos is the optional render block, which defines how infos are turned into HTML for display within a feed. When a subscribed channel is displayed (see the screenshot in Section 2), the feed shown is the result of turning infos into HTML using the render block. Render blocks support a subset of HTML along with the ability to reference and navigate info fields using a small expression language.

3.2 Building Channels

Channels are the bread and butter of Coiza. A CoizaLang channel has zero or more input ports and a single output port, all of which are typed by infos. The job of a channel is to produce output infos, often as a result of processing input infos. The resulting infos can then be displayed as feeds, assuming the channel has been subscribed to, and/or used as the input to a further channel. In this way networks of channels can be created, building on one another.

3.2.1 Getting the news

Here is a simple example of a channel which does not do any processing directly but wraps the existing channel RSSFeedProvider to define a BBC news channel. I sometimes call these types of channels assembly channels. RSSFeedProvider is one of a number of channels that hook in to externally supplied data, in this case getting information from an RSS feed. Other examples of external data channels in Coiza include Google Calendar, IMAP email, LinkedIn, current location, Wikipedia, currency converters ... and the list is growing all the time. From a Coiza point of view these behave exactly like any other (user defined) channel. Like infos, channels are named and live within a package namespace.
A channel can have zero or more parameters, which are declared in the parentheses after the channel name (line 5). In the case of the BBC News channel no parameters are required. However, RSSFeedProvider does have a single string parameter defining the RSS feed location, and the URL of the BBC news feed is supplied as an argument to the RSS feed (line 7). This BBC News channel has no input ports but defines a single output port (line 9) which simply takes the output of RSSFeedProvider. By the way, although it cannot be seen from the above code, RSSFeedProvider produces infos of type Message, which we covered in the previous section.

3.2.2 Filtering the news

Let's get more adventurous and explore some of the other features of CoizaLang. Suppose we wanted to filter the news by title; we could define a further channel as follows:

FilteredTitle demonstrates a parameterised channel, requiring a filter string, with both input (+) and output (-) ports and a body that does some processing. Note how the ports are typed as Titled infos, which is the base type that Message subtypes, thereby enabling this channel to filter titles on any type that extends Titled, including Messages. The body of the channel iterates over all the incoming infos from feed and filters them using a pattern (line 11) which essentially says that an info must be of type Titled and its title field must contain the value of filter. Any matching infos are emitted to the output port.

Now that we have a filter channel, we can create a new assembly channel to filter the BBC news, leveraging the two channels we have created. Most of the features of this channel have been illustrated previously; the one new feature is the wire declaration (line 14) which, as you may guess, defines how output ports are connected to input ports, in this case how the output of the BBC news channel becomes the input of the filter title channel. The output of the filter title channel is then the output of the assembly channel (line 12).
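The way a source channel, a filter channel and an assembly channel fit together can be sketched in Python. This is an illustration under assumptions, not CoizaLang: `static_news` is a hypothetical stand-in for the BBC News channel (it returns fixed data rather than reading an RSS feed), `filtered_title` mirrors the behaviour described for FilteredTitle, and `filtered_news` plays the role of the assembly channel, with a function call standing in for the wire declaration. Making the assembly channel take the filter string as a parameter is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Titled:
    """Base info type that the filter's ports are typed on."""
    title: str


@dataclass
class Message(Titled):
    """Simplified Message info; extends Titled so the filter accepts it."""
    content: str
    link: str
    author: str


def static_news() -> List[Message]:
    """Source channel with no input ports (fixed data, hypothetical)."""
    return [
        Message("Rain expected in London", "Showers later.", "https://example.com/1", "desk"),
        Message("Trains delayed", "Signal failure.", "https://example.com/2", "desk"),
        Message("Markets steady", "Little change.", "https://example.com/3", "desk"),
    ]


def filtered_title(filter_text: str, feed: Iterable[Titled]) -> Iterator[Titled]:
    """Filter channel: emit each incoming Titled info whose title contains filter_text."""
    for info in feed:
        if isinstance(info, Titled) and filter_text in info.title:
            yield info


def filtered_news(filter_text: str) -> List[Titled]:
    """Assembly channel: the news output is wired to the filter input,
    and the filter output becomes this channel's output."""
    return list(filtered_title(filter_text, static_news()))


for item in filtered_news("Rain"):
    print(item.title)
```

Because `filtered_title` only depends on `Titled`, it could be reused unchanged on any other flow of titled infos, which is the point the article makes about typing ports on the base type.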
3.2.3 Publishing the news

While developing a channel it is possible to test it in order to ensure it works as designed, as shown below. For a user channel to be subscribable, and used outside of testing, it is important to guarantee that it is not going to change. To do this, a channel must be published, which then prevents change. Before publication can happen, all infos or channels that are referenced by the channel must also be published, and the channel must not have any type checking issues (there is no sense in publishing a channel that won't work). Unpublishing can only take place if the construct being unpublished has no dependents, either in the form of other constructs or user subscriptions. If a change is required to a published channel with dependents, then the only approach is to create a new version of that channel (or indeed info).

We have created a couple of channels, BBC News and Filtered BBC News, that once published can be subscribed to by any user. Rather than the user having to search for the CoizaLang channel name (i.e. BBCNews or FilteredBBCNews), it is possible to give the channel a user-friendly name along with a description, both of which are used by the search mechanism for subscribable channels.

3.2.4 Tell me in the morning

You've probably got a handle now on how Coiza works: anyone can build channels, and those channels, once published, can either be subscribed to or used as the basis of further channels. By way of a final example, if Bob Brown publishes a channel to filter based on the time of day, then we can create another BBC News channel which filters on both the title and the time of day.

4. Summary

I have talked about how Information Engines can help bring information together into a form that is more appropriate to what the user is interested in knowing, and I have walked through an example of an Information Engine I have constructed called Coiza.
Hopefully Coiza looks useful and you will consider becoming a subscriber to the rich array of channels that are beginning to be defined, or indeed define one or more channels for yourself. Finally, in case you were wondering why Coiza is called Coiza, it comes from the Portuguese word coisa, meaning thing!
