Getting started with hypergraphs

What are hypergraphs?

In mathematics, a hypergraph is a generalization of a graph in which an edge can connect any number of vertices. -- Wikipedia

In other words, a hypergraph is a pair (X, E): a set of vertices X and a set of hyperedges E, where each hyperedge is a set of vertices from X.

[Figure: hypergraph]

There exists a theorem that every hypergraph H may be represented by a bipartite graph BG:

[Figure: bipartite graph]

the sets X and E are the partitions of BG, and (x1, e1) are connected with an edge if and only if vertex x1 is contained in edge e1 in H. Conversely, any bipartite graph with fixed parts and no unconnected nodes in the second part represents some hypergraph in the manner described above.

In simple words, to get the corresponding bipartite graph, take the nodes of the hypergraph - let's call them U. Then take all the hyperedges and convert them to nodes, naming them after the nodes they link - they are now the second set of nodes, V. Having two groups of nodes, connect the former hyperedges (now members of set V) with the nodes they linked in the hypergraph. Every member of V is connected only with members of U, and the other way round. That's the definition of a bipartite graph!
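The recipe above can be sketched in a few lines of plain Python. This is only a minimal illustration, not code from my repository: the function name `hypergraph_to_bipartite` and the toy hypergraph are made up for this example, and each hyperedge is turned into a `frozenset` so that it can serve as a node named after the vertices it links.

```python
def hypergraph_to_bipartite(vertices, hyperedges):
    """Return (U, V, links) describing the bipartite graph of a hypergraph.

    vertices   -- iterable of hashable vertex labels
    hyperedges -- iterable of collections of vertex labels
    """
    # U: the original vertices of the hypergraph.
    U = set(vertices)
    # V: every hyperedge becomes a node, named after the vertices it links.
    V = {frozenset(edge) for edge in hyperedges}
    for edge in V:
        missing = edge - U
        if missing:
            raise ValueError(f"hyperedge uses unknown vertices: {sorted(missing, key=repr)}")
    # Connect each former hyperedge with the vertices it contained.
    links = {(x, edge) for edge in V for x in edge}
    return U, V, links


# Hypothetical toy hypergraph: 4 vertices, 2 hyperedges.
U, V, links = hypergraph_to_bipartite(
    vertices=[1, 2, 3, 4],
    hyperedges=[{1, 2, 3}, {3, 4}],
)
```

Every pair in `links` joins a member of U with a member of V, so no link ever stays inside one group, which is exactly the bipartite property.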

Motivation

My thesis is about modelling diffusion on hypergraphs. Hypergraphs are very useful structures and they're used in many applications. There are some examples from mathoverflow.

However, I think they're much less well known than graphs. I even had trouble finding books about them in my university library...

If you're interested in learning hypergraph theory, there are some books on Amazon that are great introductions.

Representation of hypergraphs

I searched the internet for a good library for representing hypergraphs, but I haven't found anything satisfactory, although there are lots of tools for ordinary graphs. It wasn't only my problem: I found this question on stackoverflow, which asks for such tools. The answers aren't any good, but the question links to a paper about visualization of hypergraphs - Visualization of Hyperedges in Fixed Graph Layouts (PDF) - which looks really promising.

I plan to develop my own Python library/package to visualize hypergraphs using the excellent matplotlib library.

Programming hypergraphs

To program hypergraphs I use the NetworkX library, which is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.

NetworkX is really great. It's easy to use and open source (BSD license), and it has plenty of useful functionality, well-written documentation, a tutorial and a gallery of examples.

To start my programming adventure with hypergraphs I used an IPython notebook. You can see it here. Here is also the whole github repository of my work. I'm just getting started though, so it's not as polished as I would like it to be.
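To give a feeling for it, here is a small sketch of how a hypergraph could be stored in NetworkX as its bipartite graph. It is an illustration only, not the code from my notebook: the toy data and the hyperedge names `e1`, `e2` are invented, and marking the two parts with a `bipartite` node attribute simply follows the convention used by NetworkX's bipartite module.

```python
import networkx as nx
from networkx.algorithms import bipartite

# A toy hypergraph, invented for illustration: hyperedge name -> linked vertices.
vertices = [1, 2, 3, 4]
hyperedges = {"e1": [1, 2, 3], "e2": [3, 4]}

G = nx.Graph()
# Mark the two parts of the bipartite graph with a node attribute.
G.add_nodes_from(vertices, bipartite=0)
G.add_nodes_from(hyperedges, bipartite=1)  # iterates over the hyperedge names

# Connect every hyperedge node with the vertices it contains.
for name, members in hyperedges.items():
    for vertex in members:
        G.add_edge(vertex, name)

is_two_colorable = bipartite.is_bipartite(G)
# The degree of a hyperedge node is the number of vertices the hyperedge links.
edge_sizes = {name: G.degree(name) for name in hyperedges}
```

Once the hypergraph lives in an ordinary `nx.Graph`, the usual NetworkX algorithms and drawing functions can be applied to it.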

Learning Physics

Exams

In Poland there is an examination session twice a year.

After approximately two weeks of hectic work on finishing projects and passing all the tests required to be admitted to the exams, we jump right into the examination session. The semester ends on Friday and the first exam is the next Monday.

It's pretty overwhelming. But what can you do? Learn. I don't like this form of grading, and not because of the stress or the large amount of knowledge to learn in little time. I don't like it because the material is so inflexible and sometimes stupid, and therefore boring and hardly applicable.

I enjoy learning physics, but I don't enjoy learning it for exams.

Wikipedia in the process of learning

What I also hate is the poor quality of Polish Wikipedia articles. In less than a week I read a lot of bullshit or just seriously incomplete articles. I would like to spend some time working on Wikipedia, but I don't feel like an expert in the fields I'm talking about. Why would I read those articles otherwise? Articles in English are much better, which is very helpful.

I like definition articles because they strive to be self-contained and are exactly about defining and describing, which is what theoretical exams want us to do.

Stupid students

To end today's rant I'll add a few words about "students' learning materials". There are a few documents named "script for That and That". Some of them are poorly written, some are a bit better and quite helpful in preparing for the exam. They're written in LaTeX; its font is easy to recognize. But there are mistakes... some are minor typos, some are just incorrect or incomplete information. I would gladly work on those materials, but they are distributed in pdf format. Great for collaboration, isn't it?

So we as students pass these materials around, learn from them a bit, pass them to the next generation... And those materials slowly deteriorate. Why? Are we too stupid to edit LaTeX? No. Someone already wrote it in LaTeX. Can't we use version control? Github is waiting for us with unlimited public repositories and a great collaboration structure. There is also bitbucket if we want to make it private. There weren't such tools in, for example, 2004, but there are in 2014, and we should take advantage of them.

But how many people work with git if they don't have a CS background? They would just download the pdf, print it, make some notes on the side, use some colors and learn from it, and nothing from their knowledge would be merged back into the community. Nothing. Nothing will improve.

But we could have nice things...

Doing the right things

How do we do the right things? I'm sure that almost everyone can distinguish between good and evil (even though there are different points of view).

But the right things at the right time? There is always a difference between the things you ought to do, the things you want to do, the things you actually do and those which would turn out to be the best for you in retrospect.

Optimizing

Premature optimization is the root of all evil -- Donald Knuth

Optimization is hard. We have methods to do it; you can learn them, for example, in this course. I'm talking about optimization here because I often find it hard to decide what would be good for me, and I'm unsure when setting my own goals. When I actually try to optimize real work, I tend to be less productive than I would probably be just doing the stuff sequentially.

There are two sides of optimization: planning and execution. I think that the biggest obstacle is uncertainty about the future. When you have a lot of control, optimizing can be very rewarding, but when random things happen it becomes less effective and more stressful, and we can end up with premature optimization.

Learning and creating

Should I spend more time learning new things or mastering old ones? Should I gain a profound knowledge in some subject or just pass a test?

Do I really care about it? Will I need it? Or does it just seem exciting, while I'm too tired to get anything from it?

And those questions are only about learning. What about actual creation, or just doing stuff because it creates value?

Creation takes time, and so does solving real problems, for example with business logic, or doing projects. The more you do, the less time you have for the next interesting new thing.

Maybe you are sticking to your last project because of sunk costs? Or you're afraid of being accused of never finishing anything?

But what if you're just wasting your time and resources?

One of my friends says that it's important to do something, and what it is exactly doesn't really matter. However, I'm not sure about it. I just can't stop evaluating things: some are more valuable and useful and some are less.

The other important distinction for me is between useful/valuable and good for your development. Lots of mundane things which I do aren't optimal for my development, but are useful and good, etc.

I'm a learner type. I really enjoy new articles, libraries, theories, etc. By learning new skills I enhance my potential and become a better ... whatever I am doing at the moment. But I can't just have more and more potential.

What for? I have to spend it somewhere! Do things, to feel useful and productive and to accomplish something (not only exercise). The bitter side of learning is forgetting. Learning and forgetting. To remember new knowledge and truly understand it you should use it to solve real problems. And real problems are hard. And you have to be prepared and take your time to learn the things which will enable you to solve them.

It might sound a bit like gibberish, but doing (both creating and executing) and learning cannot escape each other. And to gain great knowledge and accomplish great things you'll need both. But what is the right balance?

Learning and inspiration

I realized that I rarely learn anything nontrivial from reading "technical blogs". I learn by solving problems and by reading long books on a subject. But reading blog articles is worthwhile nonetheless, because they often inspire me to try something new or let me know more about the industry and the feelings of people similar to me. It's entertaining but a bit shallow.

Reading long, complicated books or doing courses such as those on edX or Coursera is hard, takes time, and you often don't get enough feedback (especially from books ;).

I often find my studies (applied physics) both too specialised and not comprehensive enough.

And I'm not really sure what exactly I want to learn. "Everything" is not an answer, at least not in the near future. And getting too much inspiration leads to lots of wasted time and not accomplishing anything.

Decisions

Summing up, I don't believe in real-time life optimization, and I don't know precisely what I want to do or learn.

I wanted to set some goals as new year's resolutions, but I'm procrastinating. I'm procrastinating because I'm not sure whether I would be able to finish them or whether I would change my mind. But without big goals I'm driven by other people, and I like the feeling of control over my life and the satisfaction from accomplishments.

I'm looking for the right things to do.

Great abundance of python projects

Pycoder projects of 2013

I read the Pycoder newsletter. Every week it brings a few cool new open source projects, discussion threads and job offers (archive). It's great to know what is happening in your community, and some of these projects prove to be very useful or just inspiring.

But this week their mail was so overwhelming (and also very cool)! In their newest mail they grouped and listed the coolest Python projects of the year. There are plenty of them, in categories such as:

  • web development
  • data visualization
  • devops tools
  • debugging
  • testing

and more.

I would need a year to get through all these amazing projects. Sadly I have lots to do and little time (my projects won't develop themselves without my help).

Some of the projects which really gained my interest:

Falcon is a high-performance Python framework for building cloud APIs. It encourages the REST architectural style, and tries to do as little as possible while remaining highly effective.

  • isso - open source project similar to Disqus (worth trying/contributing?)

  • cookiecutter

A command-line utility that creates projects from cookiecutters (project templates), e.g. creating a Python package project from a Python package project template.

I was looking for a tool to automate setting up my flask/django apps and I have probably found it. Cookiecutter has its own cookiecutter-flask:

A Flask template with Bootstrap 3, starter templates, and working user registration.

  • sure - this test framework with monkey patching is just too cool to be true. Damn, I would love to use it in all my projects! I like my unit tests in unittest(2) but often I'm a bit tired of all that 'javaish' boilerplate.

  • pulsar - concurrent framework for python, written with python 3 in mind. No more: "Damn, I want this nice concurrency library, but no one (yes, gevent, I'm looking at you) supports python3, let's do it in python2"

  • django-xadmin - drop-in replacement of django admin with lots of additional goodness

  • simmetrica - Lightweight framework for collecting and aggregating event metrics as timeseries data

  • fn - functional tools for python

My afterthoughts

And there is more. sighs

I have a dilemma: I would like to use them all, but I won't, and it might not even be the best idea, because some newly hyped python projects are just half-baked and inconvenient to use. I had a so-so experience with pydown - it worked, but I'd probably have been much more productive in almost anything else.

I don't even have time to learn all those great, well-established libraries which I use or would like to introduce in my new projects. I don't know if learning faster would be a solution to this problem. Probably not reading the internet and not knowing about all this amazing stuff would solve my problem with being excited about too many projects and not being able to learn more about them.

Projects which proved most useful to me

This year I did a lot of web development. I used mainly django and flask.

They have different feel but both are production ready, well documented, fun to use and have lots of available plugins. Great tools to get things done.

I would die without the great duo:

  • pip - installing python packages with ease
  • virtualenv - creating virtual environments for python projects

This year I discovered the great power of IPython with its notebook. IPython made my python development much more convenient. I feel pain when I launch the normal python shell by mistake. IPython's features are indispensable. The IPython notebook brings a totally new quality. I tend to use it for everything - work, personal projects, research and studies. It's amazing. I even wrote posts about these tools.

I moved my blog to nikola which works really smoothly.

For real deployment I started using:

Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.

Sentry is a realtime, platform-agnostic error logging and aggregation platform

Summary

All these projects are amazing and written in python. But there are lots of other programming languages, and I don't know their communities or their most famous and useful projects. Porting various libraries between languages is a very interesting idea.

I sometimes wonder if we reinvent the wheel all the time. Just look at pages such as this one, which lists just static site generators.

But maybe it's actually pretty good? The strongest/best will survive, chosen by demanding developers, and those which don't survive will provide inspiration for newer ones.

Seven Databases

Data needs to be stored

It's pretty obvious: we want our data to be persistent, available and consistent. Hence we have big expectations of our databases. We can't have everything - which is nicely expressed by the CAP theorem.

I often read about different databases on blogs such as this excellent one.

But what to choose? What do I need in my app? The choice is often very hard. It's even harder when one doesn't know the actual advantages and disadvantages of different databases.

Read more…
