Saturday, May 31, 2014

Gru on Stage

Just finished setting up a staging server on openshift. Here are the lessons learnt (and, by extension, the reasons it took the better part of a week) getting it done.

1st, a small note about the 'cloud'. This is a generally misunderstood concept, but at heart it boils down to someone else setting up the computer infrastructure you need to deploy your solution. It might come in the form of Infrastructure (IaaS), Platform (PaaS) or Software (SaaS). With all of these, however, you must understand the limitations and implications of different vendors because (believe me) they are different.

Openshift provides Platform as a Service, and with that comes a runtime environment (e.g. Python), databases (e.g. MySQL) and other server related stuff like a shell (yaay). That said, openshift enforces restrictions, the most important (to us) being:
1. Restricted ports. You are NOT allowed to arbitrarily bind to ports from your application. You may only listen on internal ports in the range 15000-35530, and even then these ports are only visible internally.
2. WSGI compliance - By default, the python instance runs on a WSGI compliant server, which presents a real headache when you want to run your own custom server instance (e.g. Twisted).

With that understood, this is how to replicate a Gru instance setup.
1. Create an account with openshift (obviously)
2. Set up a Python (2.7) app
3. Add the necessary pieces, i.e. MySQL and phpmyadmin

Now clone the repo to your localhost and you'll see a file named wsgi.py in there. This is the default file that openshift loads your app with, and it has 2 end points
a) Health - just returns 1 if the server is up and running
b) env - Returns a list of environment variables on the server. This should obviously be turned off on a production server.
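To make the two end points concrete, here is a minimal sketch of a WSGI file that behaves the way the description above says. It is not the actual wsgi.py that openshift generates; the paths /health and /env and the 404 fallback are assumptions for illustration.

```python
import os


def application(environ, start_response):
    """Tiny WSGI app with a health end point and an env end point."""
    path = environ.get("PATH_INFO", "/")
    if path == "/health":
        # Returns 1 if the server is up and running.
        status, body = "200 OK", "1"
    elif path == "/env":
        # Lists environment variables on the server.
        # Turn this off on a production server.
        status = "200 OK"
        body = "\n".join(
            "%s=%s" % (key, value) for key, value in sorted(os.environ.items())
        )
    else:
        status, body = "404 Not Found", "Not found"

    payload = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(payload))),
    ]
    start_response(status, headers)
    return [payload]
```

The WSGI server calls application() once per request, passing the request environment and a start_response callback.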

Before continuing, it's imperative to understand that openshift does NOT maintain a system-wide installation of 3rd party modules. If your app uses any 3rd party modules (like ours does), they need to be added to setup.py so that every time your app is built, these dependencies are installed.

Add the needed dependencies in the "install_requires" list.
Selector is a great 3rd party module for routing URLs in an easy way -- https://github.com/lukearno/selector/

Now, what we need to do is replace wsgi.py with our very own custom app.py. Openshift will instead load our app.py each time a request comes through.


Of importance to note is
Line  31 -- Where you point to your own python application(in this case index.py)


Note that the host for MySQLdb is the internal openshift IP(in shell $ echo $OPENSHIFT_MYSQL_DB_HOST).
Also take great note of how to call and return a method (view def status) as described here -- http://webpython.codepoint.net/wsgi_application_interface
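As a rough sketch of both notes above: the first function reads the database settings from the environment, and the second is a view written against the WSGI application interface. Only OPENSHIFT_MYSQL_DB_HOST comes from this post; the other variable names, the fallback values and the function name mysql_settings are assumptions, so check them in your own shell before relying on them.

```python
import os


def mysql_settings(environ=None):
    """Collect MySQL connection settings from environment variables."""
    if environ is None:
        environ = os.environ
    return {
        # Internal openshift IP of the MySQL cartridge (named in the post).
        "host": environ.get("OPENSHIFT_MYSQL_DB_HOST", "127.0.0.1"),
        # The three names below are assumed, not taken from the post.
        "port": int(environ.get("OPENSHIFT_MYSQL_DB_PORT", "3306")),
        "user": environ.get("OPENSHIFT_MYSQL_DB_USERNAME", ""),
        "passwd": environ.get("OPENSHIFT_MYSQL_DB_PASSWORD", ""),
    }


def status(environ, start_response):
    """A view: call start_response first, then return an iterable of bytes."""
    body = b"OK"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

With MySQLdb installed, the settings can be passed along as MySQLdb.connect(db="yourdb", **mysql_settings()), where "yourdb" is a placeholder for the real database name.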

As mentioned previously, the 'Selector' is a great 3rd party module for routing URLs as demonstrated by lines 53-58

Wednesday, May 28, 2014

Gru Setup and Config

Because I like incremental builds, I always start with the very basic functionality needed then grow outwards that way, scaling the environment as needed and refactoring code on the fly(sic).

Since we still haven't got an official home for Gru{he's still homeless :( } I've refrained from making this deployment procedure the official one on bitbucket's README document. However, for a quick and dirty implementation on localhost, here's how to go about it.

Pre-requisites
1. Linux Kernel 2.6+ (M$ Windoze people, sorry)
2. Python (preferably 2.7.x)
3. Selector (pip install selector) -- used for routing on a WSGI server
4. MySQLdb(apt-get install python-mysqldb) -- connector between python and mysql
5. MySQL Server 5.4+(Obviously)
6. Open port(8080)

Steps
Get index.py from the repo and run it with

python index.py &
Always make sure to stop the server before starting it again with
killall python

Error Reporting
Please report any (logical) errors here -- https://bitbucket.org/techxusteam/grus-server-geoaddress/issues?status=new&status=open

N/B: "Python Server is blocking port.."  is NOT an error! Python does not control port access, the OS does. Ask for help first before forming a conclusion on what's possible and what's not.

P/S: Because of the lack of immediate consistency in MySQL, DB changes were not immediately reflected. This has been corrected by explicitly opening and closing the db connection for each subsequent query(just like PHP).
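The open-and-close-per-query pattern looks roughly like the sketch below. To keep it runnable without a MySQL server it uses the standard library's sqlite3 as a stand-in for MySQLdb (an assumption, not what Gru uses); DB_PATH and run_query are hypothetical names. With MySQLdb the placeholder would be %s rather than ?.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

# Hypothetical database location, only for this sketch.
DB_PATH = os.path.join(tempfile.gettempdir(), "gru_sketch.db")


def run_query(sql, params=()):
    """Open a fresh connection, run one statement, commit, then close."""
    with closing(sqlite3.connect(DB_PATH)) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(sql, params)
            rows = cur.fetchall()
        # Commit so the change is visible to the next connection.
        conn.commit()
    return rows
```

A likely reason this helps with MySQL: a long-lived connection with autocommit off can keep reading from the snapshot of its open transaction, so a fresh connection (or a commit before each read) sees the latest data.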

EDIT: Updated README here <--

Sunday, May 18, 2014

Addressing and Naming -- Thoughts

While creating a completely new addressing scheme for geo-address, it's important to understand the requirements desired of the scheme.
1. Uniqueness -- Each address should be globally unique and identifiable.
2. Readability -- The addresses should be simple and easy for humans to read
3. Infinitely scalable -- Because geo-addresses represent a GPS point on Earth, the domain of possible points is (almost) infinite. Hence, the design of the scheme should reflect this eventuality while still maintaining global uniqueness.

Naming and addressing conventions usually feature multi-part variables, e.g. personal names (2-3 names), vehicle registration numbers (2+ parts) and IP addresses (4 parts). This methodology works well because it enables one to form a 'union' of two or more parts to get a unique identifier without using sequential numbering that would eventually outgrow the original targets.

In the case of geo-addresses, it's important that the addressing scheme inherits a high level of scalability for the future as well as the ability to add new variables to form the eventual *intersection. Therefore, I propose a scheme that would be constructed in the following notation:
A-B-C..-N where
A is the base variable which should be as wide as possible so as to minimize the need for other sub-set variables.
B is a subset variable that should ideally be able to map to a general localized area e.g(Chiromo)
C is the most precise point within the locale(i.e B)
N is other variables that would ideally hold metadata about the location(e.g routes, etc)
'-' is the separator between the different variables.
The aim is that An ∩ Bn ∩ Cn ∩ .. ∩ Nn generates an almost infinite set.
where *∩ is the Intersection and
Xn is the entirety of that domain sub-set.

A point to note is that A or B or C etc need NOT all be in the same format. One can be numeric, another alphabetical, and another alphanumeric. Also, the wide ASCII set of characters lets us use different characters as prefixes and postfixes to further define the address e.g
@A -- Can represent that A in this instance is a named town/city
#A -- Can represent that A is geographical location based on co-ordinates
etc..
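A small sketch of how the A-B-C..-N notation and the @ and # prefixes could be handled in code. The function names and the sample values "NBI" and "0042" are made up for illustration, and counting the available addresses as the product of the per-part domain sizes is my reading of how the parts combine, not something the scheme fixes.

```python
SEPARATOR = "-"

# Prefix meanings taken from the examples above.
PREFIXES = {
    "@": "named town/city",
    "#": "co-ordinate based location",
}


def build_address(*parts):
    """Join the variables A, B, C, ..N with the separator."""
    for part in parts:
        if not part or SEPARATOR in part:
            raise ValueError("empty part or separator inside part: %r" % (part,))
    return SEPARATOR.join(parts)


def parse_address(address):
    """Split an address into (prefix meaning or None, value) pairs."""
    parsed = []
    for part in address.split(SEPARATOR):
        kind = PREFIXES.get(part[:1])
        value = part[1:] if kind else part
        parsed.append((kind, value))
    return parsed


def address_space_size(domain_sizes):
    """Number of distinct addresses if every part varies independently."""
    total = 1
    for size in domain_sizes:
        total *= size
    return total
```

For example, build_address("@NBI", "Chiromo", "0042") gives "@NBI-Chiromo-0042", and adding one more variable multiplies the number of available addresses by the size of that variable's domain.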

*Intersections enlarge the domain more than Unions.

References:
http://rina.tssg.org/docs/FutureNetTutorialPart2-100415.pdf

As always, comments and suggestions are welcome below :)

Saturday, May 17, 2014

Introduction

This is a technology blog where I will tell the story of building the Server. Reading it will (in plain English) describe how the Server is built and the challenges(with subsequent solutions) and lessons learnt from the process.

This will be a private blog(for obvious reasons), so please do remember to subscribe.

#crazywizard