posted 12 years ago
It depends on whether you are designing for a large or a small number of clients. With a large number of clients, you could have every client broadcast itself every few minutes. With your current solution, if you have N clients, you will have N*(N-1) messages going back and forth when client N+1 joins. It might be cheaper for all N clients to do a broadcast every 5 minutes or so.
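To make the periodic-broadcast idea concrete, here is a minimal sketch of the receiving side: each client keeps a table of when it last heard from every peer and forgets peers that have gone quiet. The `PeerTable` class, its method names, and the "drop after 3 missed broadcasts" rule are my own assumptions for illustration; the actual sending and receiving of broadcasts is left out.

```python
import time


class PeerTable:
    """Tracks peers seen via periodic broadcasts and forgets silent ones."""

    def __init__(self, announce_interval=300.0, missed_allowed=3):
        # Assumed policy: every peer broadcasts once per announce_interval
        # seconds, and is dropped after missed_allowed intervals of silence.
        self.announce_interval = announce_interval
        self.timeout = announce_interval * missed_allowed
        self.last_seen = {}  # peer id -> timestamp of the latest broadcast

    def record_announcement(self, peer_id, now=None):
        """Call whenever a broadcast from peer_id is received."""
        self.last_seen[peer_id] = time.monotonic() if now is None else now

    def live_peers(self, now=None):
        """Drop peers that have been silent too long and return the rest."""
        now = time.monotonic() if now is None else now
        self.last_seen = {
            peer: seen
            for peer, seen in self.last_seen.items()
            if now - seen <= self.timeout
        }
        return sorted(self.last_seen)
```

With this, a joining client does not need to talk to anyone: it just listens for one interval and its table fills up on its own.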
Alternatively, you can have one of the nodes be the "master". Each node rolls a random number to decide whether it should be the master. A node that decides to be the master broadcasts a message, and from that point on it is responsible for broadcasting a master list of clients every few minutes. Once the other clients see that someone has decided to be the master, they stop rolling the random number. If they don't hear from the master for a few minutes, they go back to the mode where they roll the random number again. This approach is probably the most lightweight on the network, but the most difficult to implement. You will have to decide what to do if 2 nodes decide to be the master at the same time.
You also have to decide how to handle the "split brain" problem. For example: 4 nodes, A, B, C, D. A and B are in Europe, C and D are in the US. A is the master. Something happens and the connection between Europe and America drops. C and D no longer hear from A, so they decide C is the master. Now you have 2 masters, A and C, and neither knows that the other exists. A few minutes later, through the magic of the internet, the connection is established again. Now what do you do? Should A and C try to find each other again, or do you live with the split brain?
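Here is a rough sketch of the random master election described above, with the network left out so only the decision logic remains. The `Node` class, the `claim_probability` and `master_timeout` parameters, and especially the "lowest id wins" rule for two masters that discover each other are my assumptions; the tie-break is just one possible answer to the questions above, not the only one.

```python
import random

FOLLOWER, MASTER = "follower", "master"


class Node:
    def __init__(self, node_id, claim_probability=0.1, master_timeout=600.0, rng=None):
        self.node_id = node_id
        self.claim_probability = claim_probability  # chance per tick of claiming mastership
        self.master_timeout = master_timeout        # seconds of silence before re-electing
        self.rng = rng or random.Random()
        self.role = FOLLOWER
        self.master_id = None
        self.last_heard = None

    def on_master_broadcast(self, master_id, now):
        """Handle a master's broadcast; returns the master this node follows."""
        if self.role == MASTER and master_id != self.node_id:
            # Two masters can see each other, e.g. after a partition heals.
            # Assumed tie-break: the node with the lower id stays master.
            if self.node_id < master_id:
                return self.node_id
            self.role = FOLLOWER
        self.master_id = master_id
        self.last_heard = now
        return self.master_id

    def tick(self, now):
        """Call periodically; returns True if this node should broadcast as master."""
        if self.role == MASTER:
            return True
        silent = self.last_heard is None or now - self.last_heard > self.master_timeout
        if silent:
            self.master_id = None
            # No master heard recently: roll the random number.
            if self.rng.random() < self.claim_probability:
                self.role = MASTER
                self.master_id = self.node_id
                return True
        return False
```

In the Europe/US example, A and C would both be masters during the outage. Once A's broadcast reaches C again, C steps down because "A" sorts before "C", and C's old followers switch to A when they hear A's next broadcast. The two master lists still have to be merged somehow, which this sketch does not attempt.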
Besides this, you probably want to think about how to handle a very large number of clients, if you are going that route. Disconnections aside, everyone talking to everyone is an O(N^2) solution, which won't scale well. The beautiful thing about a client-server architecture is that it's an O(N) solution. The obvious problem with a client-server architecture is that the server is a single point of failure, which I guess you really want to avoid.
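To put numbers on the O(N^2) versus O(N) difference, here is a small back-of-the-envelope calculation. It counts connections rather than messages, and the function names are just mine for illustration.

```python
def full_mesh_links(n):
    """Pairwise connections when every client talks to every other client."""
    return n * (n - 1) // 2


def client_server_links(n):
    """Connections when each of n clients talks to one server only."""
    return n


for n in (10, 100, 1000):
    print(n, full_mesh_links(n), client_server_links(n))
# prints:
# 10 45 10
# 100 4950 100
# 1000 499500 1000
```

Going from 100 to 1000 clients multiplies the server's connections by 10, but multiplies the mesh's connections by roughly 100.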
I believe there are protocols like Tor that have solved this problem. I think Tor arranges the nodes in a tree-like fashion, although I might be wrong.