Jul 11, 2019
There have been a bunch of technological (and some policy) changes over the past couple decades that make starting a new town a lot easier:
- Cheap solar power
- Cheap battery energy storage
- The internet replacing phone and TV (you can build internet infrastructure yourself)
- Online shopping (if you can figure out shipping you can buy everything people in cities can)
- Homeschooling (which became legal in most states in the 90s)
- Some level of autonomous cars (they're here today if you own the roads!)
- Telecommuting and online marketplaces
- Online communities make organizing people around a common project easier
Combine that with cost of living going up in cities and you'd expect lots of people to be clubbing together with their friends, buying land in the middle of nowhere and starting new communities. But that basically doesn't happen. Why?
Jul 10, 2019
There are basically three models for content distribution on the internet.
- Centralized platforms
- Directories and publisher-hosts
- Peer-to-peer
Those three models have different pros and cons, but which one wins out has a lot to do with the specifics of the regulatory environment. With Section 230 under attack it's a good idea to take a look at them and think about how they might shift in popularity if it gets repealed or changed.
Centralized platforms
This is the predominant model now (in 2019). In this model there's a centralized organization (usually a company) that has servers that store all the content that's available for consumers to consume and distributes it to them when they ask for it. It also provides the discovery mechanism that helps consumers find content that's interesting to them. Facebook, YouTube, Twitter, Twitch, and Instagram are examples of systems that follow this model.
The benefits of doing things this way are:
- It's easy for the people running the platform to make changes. Those changes could be improvements designed to make the platform more engaging or mitigations in response to changes in the environment (eg: new kinds of spam)
- It has the potential to be more efficient since the company can build infrastructure that's purpose-built for the platform
The drawbacks are:
- There's a central point of control. If there's a moral panic around a particular type of content (whether that's industrial music after Columbine or "fake news" now) it's easy for politicians to put pressure on the system. Having a big company you can threaten to regulate looks a lot like 20th century industrial policy and politicians are very comfortable in that paradigm.
- It's impossible for consumers to adapt the system to work better for them; what you get is what the company chooses to provide. That can lead to an inferior experience for people who live in different circumstances from the people who work at these platform companies (eg: the poor).
- These companies get considerable market power and can use that power to stifle alternatives and protect whatever business model they chose.
Directories and publisher-hosts
This is the model that podcasts use now, and the model that most content used in the early-to-mid 2000s. There are two roles in this model: directories and publisher-hosts.
Publisher-hosts are the people who create content, but in this model those are also the people who make the content they created available to others. This blog, which is hosted by me on a Raspberry Pi in my apartment, is an example of this. When you clicked the link that brought you here your web browser made a connection to my home internet connection and asked my computer to send this web page to you. This eliminates platforms and hosting providers. As another example, when you download a podcast episode, that content is coming from whatever computer the person who produced the podcast chose to use, not a server run by the company that made the podcast app you're using.
Directories are sites and apps that help people find content they might like, but don't produce or host any content themselves. Podcast directories are an example of this. The iTunes podcast directory has information on a lot of podcasts that it lets you browse through, but when you actually go to download a podcast it directs your computer to whatever server the producer of that podcast chose.
The benefits of doing things this way are:
- It allows for a high degree of consumer choice. For example, since podcasts are hosted all over the place there's no big company that can say which podcast client apps are allowed and which aren't, so consumers can choose which client works the best for them.
- Since there are potentially many directories referring people to the same content there's the potential for choice and specialization in directories. If you want to make a podcast directory that only lists podcasts that are consistent with a particular religion or about a particular topic you can do that.
- It doesn't matter whether hosts are considered publishers because in this model the hosts are the authors, and authors are already liable for their own speech (eg: if they defame someone)
The drawbacks are:
- Being an ecosystem and not a single company it's very difficult to make sweeping changes that apply everywhere (this could be a benefit depending on your point of view)
- It's more work for content producers
- It's potentially less efficient because there's less purpose-built infrastructure
- Content producers need to pay for network bandwidth to distribute their content (though this can be avoided if you combine this model with peer-to-peer, which we'll talk about later)
Peer-to-peer
This was the main model for music and movies in the 2000s, though it's been in decline over the last ten years as people have moved to centralized platforms like YouTube. BitTorrent is probably the most popular peer-to-peer system today, but other examples that were popular several years ago are Kazaa and eDonkey2000.
While the directories and publisher-hosts model made content producers the content hosts, the peer-to-peer model makes content consumers the content hosts. In the publisher-hosts model you had a particular computer on the internet that's associated with each piece of content, and your computer would connect to that computer to get that content. In the peer-to-peer model your computer is connected to some peers, which are other consumers' computers, and your computer sort of asks around to see who has the piece of content you're looking for. If anyone responds saying that they have a copy, you connect to them and download it. Then, crucially, if anyone else asks around for that content your computer will respond that you have a copy and they'll get it from you.
The benefits of doing things this way are:
- The amount of network capacity available to host a piece of content scales with the demand for that content, ensuring that there's always enough capacity
- No one participant in the network is sending content to a large number of consumers (the burden is spread among a large number of people), so no one winds up with a big bandwidth bill
- It's very difficult to control, as we saw with attempts by the music and record industries in the 2000s
The drawbacks are:
- Obscure content can disappear from the network if everyone who had a copy leaves
- Downloads can be slow if there aren't many people who have a copy of a particular piece of content
Publisher-hosts / peer-to-peer hybrid
It's possible to combine the publisher-hosts model and the peer-to-peer model to get some of the benefits of both. RSS feeds (like podcasts use) that refer to files in BitTorrent are an example of this. This model works basically the same as the publisher-hosts model, except that since consumers are sharing the burden of hosting the content you can still host your content on an old computer in your house even if it's being downloaded by millions of people. And since the publisher of the content keeps a computer running all the time making that content available it won't disappear if it stops being popular; consumers can always get it from them if no one else has it.
Dec 15, 2018
These are the ideas that are currently stuck in my head and that I can't get rid of:
- Using border collies with computer backpacks to do local deliveries and other things that people are trying to build robots to do. The computer gives the dog directions like a kind of robot shepherd.
- Making new products out of e-waste
- Using the fancy supply chain techniques that enable drop-shipping to have final product assembly done by a large number of people at home. A modernized version of the 19th century putting-out system
- Peer-to-peer package delivery
- Payments and metering for mesh networks (my last post was about that)
Dec 12, 2018
There are a lot of places, even in developed countries like the United States, that don't have affordable broadband internet access. 35% of Americans don't have broadband at home, according to Pew.
Mesh technology can solve this problem by eliminating the requirement for big infrastructure providers to invest in an area before it can have affordable broadband. Mesh networks don't require any infrastructure to be built, and instead use connections between individual households to route traffic for others in addition to delivering data to the people in that household.
Added latency from mesh routing makes mesh networks appropriate only for last-mile connectivity. This means a mesh network needs to be connected to a traditional ISP in at least one place. The main obstacle to adoption for existing systems like Freifunk is, in my opinion, the lack of any incentive for people to make these connections. Freifunk might be free but it is very slow for this reason.
The obvious thing to do, then, is to build a system to provide such an incentive. Here I propose a spot auction and payments system for people with connections to the wider internet to sell that bandwidth to other people within their local mesh network.
Such a system needs to allow:
- individual households within a community to connect to each other (mesh)
- people with internet bandwidth to advertise it for sale and for buyers to bid on it
- people who won bandwidth auctions to pay for the bandwidth they won
- the traffic that's routed through the mesh to look like normal internet traffic to websites, even if it exits the mesh at multiple points
1. Mesh connectivity
Gluon seems to have this part basically figured out. It uses BATMAN-adv for mesh routing and is based on OpenWRT which makes it easy to install on home routers. There are some moderately large (1000 nodes) Freifunk deployments using it.
2. Bandwidth auction
As bandwidth consumers, users set a maximum per-gigabyte price and a monthly budget for bandwidth. If the user has another source of internet bandwidth (say, an LTE or satellite connection), that maximum price would probably be the price that other source charges.
As bandwidth providers, users specify the amount of bandwidth they have for sale (eg: the monthly cap they have with their ISP) and the minimum price they're willing to accept per gigabyte to sell that bandwidth (eg: the per-GB cost their ISP charges plus some margin). The router also does a speed test to determine the amount of bandwidth available per second.
To make both sides of this market work, the users' routers act as brokers, automatically bidding and accepting bids on the users' behalf in accordance with these constraints.
The commodity being bought and sold in this auction is a unit of bandwidth (say, 1 Megabyte) to be used within a specified time after the auction is won (say, one minute). From here on we'll call one of these commodities a bandwidth block.
Bidding works as follows:
- A bandwidth provider announces to all the other nodes in the mesh that they have a bandwidth block to sell
- Bandwidth consumers send bids to the bandwidth provider specifying how much they're willing to pay
- After a specified time interval (say, one second), if there are any bids that clear the provider's reserve price the bandwidth provider accepts the highest bid.
- Accepting the bid involves announcing to all the other nodes in the mesh who won the auction and what the price was, and waiting for an acknowledgement from the winner that they won. The other nodes in the mesh can use this information to inform prices they set for bids in the future.
- The winning bidder sends payment to the provider
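Here's a rough sketch of one round of that auction, from both brokers' points of view. The names (`Bid`, `max_bid`, `run_auction`) are made up for illustration, and I'm assuming the winner simply pays what they bid; the announcement, acknowledgement, and payment steps are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bid:
    bidder: str          # hypothetical node id of the consumer's router
    price_per_gb: float  # what the consumer offers for this bandwidth block


def max_bid(max_price_per_gb: float, budget_left: float, block_gb: float) -> Optional[float]:
    """Consumer-side broker: bid the user's price ceiling, or sit out the
    auction if paying for the block would exceed the remaining monthly budget."""
    cost = max_price_per_gb * block_gb
    return max_price_per_gb if cost <= budget_left else None


def run_auction(bids: List[Bid], reserve_price_per_gb: float) -> Optional[Bid]:
    """Provider-side broker: once the bidding interval is over, accept the
    highest bid that clears the reserve price, or nothing if none does."""
    eligible = [b for b in bids if b.price_per_gb >= reserve_price_per_gb]
    if not eligible:
        return None
    return max(eligible, key=lambda b: b.price_per_gb)


if __name__ == "__main__":
    bids = [Bid("house-a", 0.50), Bid("house-b", 0.80), Bid("house-c", 0.30)]
    print(run_auction(bids, reserve_price_per_gb=0.40))
```

A real broker would probably bid below its ceiling based on the prices it has seen announced in earlier auctions, but that pricing strategy is a separate question from the mechanics shown here.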
3. Payment and settlement
Once a buyer has won an auction the buyer must pay the seller the agreed price. This requires some sort of payments system. Blockchain systems have unacceptably high latencies for rapid payments like this (in addition to their many other well-known problems). If you're selling bandwidth to be used within the next minute you can't wait more than a minute for the payment to clear (or anywhere close to a minute).
We need a central payments clearinghouse in order to make transfers of money happen within the traditional financial system.
Having to send a request to the central clearinghouse after every auction might be viable but it would be difficult to keep the latency of getting a transfer to clear low enough. Then we're in the awkward position of either having payment processing be a bandwidth bottleneck or inviting fraud by not waiting for payments to clear before providing bandwidth.
Bandwidth consumers can easily avoid fraud by not buying any more bandwidth from nodes that don't supply the bandwidth they sold. The price of a bandwidth block is low enough that being out one block isn't an issue. But bandwidth providers may end up providing a large number of blocks to a consumer before they realize they can't pay.
Instead I propose a system based on what I'm calling coupons. A coupon is a signed message from the clearinghouse that says, effectively, "The clearinghouse will pay to the bearer of this coupon on demand the amount of $X if delivered by Y date". Coupons have one important feature, which is they can only change hands once after they've been issued by the clearinghouse. Party A receives a coupon from the clearinghouse, party A transfers it to party B, and party B takes it back to the clearinghouse to cash it in. Only allowing the coupons to change hands once means we can detect double-spending without keeping any kind of ledger, and in the majority of cases without having to send anything over the network in each transaction.
The way we enforce that coupons only get spent once is:
- Coupons are never valid for more than, say, one month
- Each coupon has:
  - a unique identifier
  - a face value
  - an expiration time
- Each node keeps a list of all the coupon unique ids they've seen in the last month
- Each node keeps a Bloom filter of that id list
- Each node periodically broadcasts their Bloom filter to all other nodes in the mesh, along with the expiration time of the youngest coupon in the list
- Each node keeps all of the Bloom filters that contain coupons that haven't expired
- If a coupon isn't in any of the node's Bloom filters, it hasn't been spent
- If a coupon is in one of the Bloom filters, ask the node whose Bloom filter that is to check their full list in case it was a collision
- If that node says it was not a collision, reject the coupon
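To make the double-spend check concrete, here's a minimal sketch of it. Everything in it is an assumption for illustration: the `BloomFilter` and `Node` classes, the filter size and hash count, and the use of a direct attribute lookup (`peer.seen_ids`) where a real node would send a network request. Checking the clearinghouse's signature and pruning expired ids are left out.

```python
import copy
import hashlib


class BloomFilter:
    def __init__(self, size_bits=8192, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent; True may be a collision.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class Node:
    def __init__(self, name):
        self.name = name
        self.seen_ids = set()        # full list of coupon ids this node accepted
        self.own_filter = BloomFilter()
        self.peer_filters = {}       # peer Node -> last filter it broadcast

    def broadcast_to(self, peers):
        # Send a snapshot, so later changes need another broadcast.
        for peer in peers:
            peer.peer_filters[self] = copy.copy(self.own_filter)

    def accept(self, coupon_id, now, expires_at):
        if now >= expires_at or coupon_id in self.seen_ids:
            return False
        for peer, bloom in self.peer_filters.items():
            # Only on a filter hit do we ask the peer to check its full list
            # (stand-in for a network round trip).
            if bloom.might_contain(coupon_id) and coupon_id in peer.seen_ids:
                return False
        self.seen_ids.add(coupon_id)
        self.own_filter.add(coupon_id)
        return True


if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    print(a.accept("coupon-1", now=0, expires_at=100))
    a.broadcast_to([b])
    print(b.accept("coupon-1", now=1, expires_at=100))
```

Note the window this leaves open: a coupon spent at one node can still be spent at another until the first node's next broadcast arrives, so how often nodes broadcast bounds how much a double-spender can get away with.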
Using these coupons we can process payments with very low latency while still preventing double-spending.
From a user's perspective this is just buying and receiving credit similarly to how people buy cellphone minutes.
4. Routing and VPN
For this I propose using a version of the OpenVPN protocol extended with an additional packet type that lets the client move the session to a new client IP address when it switches providers. The clearinghouse runs this VPN gateway and pays for it with the transaction fees it charges for issuing coupons.
Apr 05, 2018
- I want to buy something from someone, which requires sending them money
- I don't know them, and they don't know me
- They're too far away for me to walk over and hand them cash (or rice or a goat or whatever)
- Neither they nor I have a credit card or a bank account
- I can't rely on having access to a central broker or ledger. Even if I had access to such a thing I wouldn't trust it. For example, I might live in a place where all the central banking systems are controlled by the government and have problems with graft and rent-seeking.
In other words, I need an actually distributed payment system. Bitcoin won't work for the obvious reason that it relies on a single ledger that every participant in the system can read.
So, can we solve this? Well, first let's see if we can solve an easier sub-problem and then build up from there.
- I want to buy something from someone, which requires sending them money
- I know the person I want to send money to
- Because I know them they have a pretty good idea of how creditworthy I am
- We're physically close enough that I can hand them cash
- I can't go over to their house and hand them cash right now, because I'm busy
I think the solution to this is pretty obvious. If they trust me enough they can accept an IOU where I agree to pay them in cash the next time I'm in the neighborhood, as long as I promise that this'll be within a certain time window (eg: a month). In other words, we're separating payment and settlement. I'm paying them for the thing now, that payment is taking the form of a debt on my balance sheet and an asset on theirs, and that debt will get settled at a specified time (unless I default). This is how a lot of the grown-up banking system works.
What if the person I'm paying is a computer?
They're not a computer, but this blog-post is building towards an automated payment system that people don't have to think too much about, so we'd like for things like creditworthiness to be determined automatically. One way to automatically determine my creditworthiness is to use Bayesian updating. Let's define creditworthiness mathematically as the inverse of the probability of default. We start with a prior probability of how likely people we know nothing about are to default. Then, when we extend credit to someone, we update that probability for them based on whether they paid or not. The more of the time they pay, the more confident we are that they will pay if we lend money to them again. If the probability of default is above some threshold, we ask them to pay extra (a risk premium) so that, in a pool of risky people, the extra money we get from the ones who pay makes up for the money we lose from the ones who don't. If the probability of default is above some higher threshold we refuse to lend them money.
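As a sketch of what that updating could look like, here's one simple way to do it: treat each loan as a coin flip and keep running counts of defaults and repayments (a Beta prior, in Bayesian terms). The prior counts, the two thresholds, and the names `CreditEstimate` and `quote` are all assumptions I picked for illustration. The premium is the break-even one: if a fraction p of borrowers default, charging p / (1 - p) extra to everyone makes the ones who pay cover the ones who don't.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CreditEstimate:
    # Prior pseudo-counts: assume strangers default about 1 time in 10.
    defaults: float = 1.0
    repayments: float = 9.0

    @property
    def p_default(self) -> float:
        return self.defaults / (self.defaults + self.repayments)

    def update(self, repaid: bool) -> None:
        # Each settled (or defaulted) IOU is one more observation.
        if repaid:
            self.repayments += 1
        else:
            self.defaults += 1


def quote(p_default: float, premium_above: float = 0.05,
          refuse_above: float = 0.30) -> Optional[float]:
    """Return the risk premium as a fraction of the amount lent,
    or None if we refuse to lend at all."""
    if p_default > refuse_above:
        return None
    if p_default > premium_above:
        # Break-even premium: (1 - p) * (1 + premium) = 1
        return p_default / (1 - p_default)
    return 0.0


if __name__ == "__main__":
    me = CreditEstimate()
    print(me.p_default, quote(me.p_default))
    for _ in range(11):
        me.update(repaid=True)
    print(me.p_default, quote(me.p_default))
```

With these made-up numbers a stranger starts out paying a premium, and after enough repayments their estimated probability of default drops below the lower threshold and the premium goes away.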
What if the person I'm paying doesn't know me?
If the person I'm paying doesn't know anything about me, then they don't know whether I'm going to give them the money or not. However, if they know someone who knows me, that person can "vouch for" me by passing along their estimate of my probability of default. The person I'm paying can get probability estimates about me from everyone they know who has them, take an average of those estimates weighted by how reliable the person that gave the estimate is, and use that as the probability I will default. And if someone they know doesn't know me, they can ask the people they know about me in turn. This will get information about me if it exists in the network even if it's far away. This is similar to the web of trust concept in cryptography, and to this paper in p2p file-sharing networks.
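The weighted average is simple enough to write down directly. This is only a sketch: `pooled_p_default` is a made-up name, the reliability weights are assumed to be non-negative numbers the asker already has for each contact, and the recursive asking-around is left out.

```python
from typing import List, Optional, Tuple


def pooled_p_default(estimates: List[Tuple[float, float]]) -> Optional[float]:
    """Combine (p_default, reliability) pairs from the people who vouched
    into one probability. Returns None if nobody usable had an estimate."""
    total_weight = sum(weight for _, weight in estimates)
    if total_weight == 0:
        return None
    return sum(p * weight for p, weight in estimates) / total_weight


if __name__ == "__main__":
    # One trusted contact says 10%, one less trusted contact says 50%.
    print(pooled_p_default([(0.1, 3.0), (0.5, 1.0)]))
```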
This system has a bootstrapping problem: what do you do if no one in the network knows you? I don't have a good general solution to this, but since anyone can generate probabilities of default any way they want, you could have participants in the network that acted like rating agencies. If you want to join the network you can pay them, and they'll use outside information to provide a probability of default. Since you're paying them they have an incentive to give you a good rating, but that incentive is balanced by the reliability rating the rest of the network places on them. If a rating agency consistently gives good ratings to people who end up defaulting the rest of the network won't count their ratings for very much.
What if the person I'm paying is too far away to deliver cash to them?
It's likely that if I don't know someone, someone I know knows someone who knows someone who knows them. By "knows" I mean both knows and is physically close enough to hand-deliver cash to. Another way of saying this is that it's likely that there's a path in the network between me and the person I'm transacting with. If that's the case, then all we have to do is find one of those paths, have each party along the path pass the debt on to the next step in the path, and pay each of them something to make it worth the default risk. This works as follows:
1. I ask everyone I know to transfer a certain amount of money to a certain recipient
2. If any of them knows the recipient, they send back a bid for what fee they want to charge to make the transfer.
3. The ones that don't know the recipient send the same transfer request to everyone they know, and send back a bid which is the lowest bid they got plus some premium for themselves. Because steps 2 and 3 are recursive they amount to doing a search across the whole network.
4. I get back bids from the people I know and pick the lowest one.
5. The information that I accepted that bid travels along the path in the network it represents and reaches the person I want to pay.
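The bidding procedure above can be sketched as a recursive search. This is a toy version under several assumptions: the whole network is a dictionary in one process (in the real thing every node only knows its own contacts), each intermediary charges a flat fee from a made-up `fees` table, and nobody times out. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Bid = Tuple[float, List[str]]  # (total fee, path starting at the bidder)


def bid_from(node: str, recipient: str, network: Dict[str, List[str]],
             fees: Dict[str, float], visited: Set[str]) -> Optional[Bid]:
    """The bid `node` sends back when asked to get money to `recipient`."""
    contacts = network.get(node, [])
    if recipient in contacts:
        # This node knows the recipient, so it bids just its own fee.
        return fees[node], [node, recipient]
    best = None
    for contact in contacts:
        if contact in visited:
            continue  # don't send the request back along the path it came from
        sub = bid_from(contact, recipient, network, fees, visited | {contact})
        if sub is not None and (best is None or sub[0] < best[0]):
            best = sub
    if best is None:
        return None
    # Otherwise: the lowest bid received plus a premium for this node.
    return best[0] + fees[node], [node] + best[1]


def find_route(sender: str, recipient: str, network: Dict[str, List[str]],
               fees: Dict[str, float]) -> Optional[Bid]:
    """The sender collects bids from everyone they know and picks the lowest."""
    if recipient in network.get(sender, []):
        return 0.0, [sender, recipient]  # close enough to settle directly
    bids = [bid_from(c, recipient, network, fees, {sender, c})
            for c in network.get(sender, [])]
    bids = [b for b in bids if b is not None]
    if not bids:
        return None
    total, path = min(bids, key=lambda b: b[0])
    return total, [sender] + path


if __name__ == "__main__":
    network = {
        "me": ["ann", "bob"], "ann": ["me", "cat"], "bob": ["me", "dan"],
        "cat": ["ann", "zoe"], "dan": ["bob", "zoe"], "zoe": ["cat", "dan"],
    }
    fees = {"ann": 1.0, "bob": 0.5, "cat": 1.0, "dan": 0.5, "zoe": 0.0}
    print(find_route("me", "zoe", network, fees))
```

Written this way the search tries every path that doesn't loop back on itself, which is fine for a toy network but gets expensive fast; a real version would need a hop limit or a deadline on bids.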
What about systemic risk?
Remember the 2008 financial crisis? That was partly caused by banks borrowing money with the intention of paying the loans back with money that they were owed. Enough banks were doing this that when a few big banks defaulted, the banks they owed money to couldn't pay their debts, and this created a chain reaction of defaults. The standard solution to this in a traditional banking system is for a state regulator to require that banks only borrow a certain percentage of the money they have in cash, so even if a large percentage of the loans they made go bad they'll still be solvent. Doing this requires being able to verify the amount of money that people have. I don't know how you'd do that in the sort of system I'm proposing, especially because the system doesn't say how settlements should happen; anything that's acceptable to both parties is acceptable to the system. So I don't have a good solution to this. Suggestions welcome.
With the pieces I described above we have a sketch of a system that solves the problem posed at the beginning. This system is completely peer-to-peer; every participant in the system has the same status as every other participant. It works over large distances while still being based on physical currency. It doesn't require everyone in the network to have information about everyone else. It doesn't have a central ledger.
Apr 20, 2017
"Federation" is a word people use when they mean interoperability in the context of internet communication. It means that you can talk to people who aren't in the same network as you. If I use fastmail and you use gmail, I can still email you.
Most social networks aren't federated. If I'm mutuals with you on twitter that doesn't mean I can message you on facebook, and you don't see my instagram posts unless I explicitly cross-post them.
There's a pretty obvious incentive for companies that run these networks not to federate. Their primary asset is their users and the network effects that keep them from leaving. If the only social network you use is facebook and I want to read what you post online I need to make a facebook account and read your posts on facebook. Once I've done that facebook can show me ads, generating revenue for themselves, and show me other content to try to get me to spend more time on facebook. If I could follow your facebook account from twitter, facebook would have no way of making money off of me, and I would have little attachment to facebook.
Those same network effects also make it difficult for people to start new social networks. A user gets nothing out of a social network unless there are other people on that social network to talk to. If a new social network can't interoperate with existing social networks then it has to have a pool of users who want to talk to each other on day one. This is hard to do.
You can also try to piggy-back on top of one of the larger social networks by making it easy for your users to share content on those networks and having links back to you. But this is awkward because each of the larger networks knows that you're a potential threat to them, as they were to networks that came before them, and they also know that at the moment they're in a position of power over you. So as soon as it looks like you might be growing in a way that could cause a problem, they change the rules. The feed ranking algorithm changes. Your media embed or autoplay privileges are revoked. The way your feed items are displayed is changed so they're not as prominent. The larger network usually justifies these changes as combating spam and obnoxious content, and they partly are, but there's also a clear competitive incentive behind them.
The result of this difficulty is that there aren't that many social networks. If you were on the internet in the early-to-mid 2000s you probably remember phpBB and vBulletin forums. These forums typically had tiny numbers of users, maybe in the thousands, and there were a lot of them. Because the software that powered the forums was easy to modify each one was hacked in some way that made it a little different from the default installation. Custom moderation tools would get built in response to things that happened on the forum; tools that only make sense in the context of the people who use that particular forum. In-jokes, memes, and local flavor would get added to the software. Features would get built that catered to things people did on that forum which were different from what people did on other forums.
Contrast this with networks like facebook, twitter, or snapchat. A centralized design team decides what is or isn't worthy to get built for everyone who uses the network. This means that they have to cater to the average of all their millions of users. Of course everyone is different from that average in some way, so really that means that it's an equally awkward fit for everyone.
If there's a critical mass of users that use social networks that other networks can interoperate with, then the micro-localization (localization down to the level of a social clique, not just a country) of the old forum model becomes viable again. The problem of getting a critical mass of users goes away because your users can talk to the critical mass that already exists in the rest of the federation.
Various attempts at this have been made, with varying levels of failure. identi.ca was the first one that I know of (unless you want to count old-school things like Usenet). identi.ca evolved into gnusocial, which evolved into qvitter and postactiv. mastodon is a new social network that's interoperable with gnusocial and friends with a slicker, more dynamic interface. Diaspora is probably the server that's gotten the most press, though it's only interoperable with other Diaspora installations.
I'm interested in this now because mastodon is the first of these that enough people in my social circle started using for me to get any use out of it. It's still tiny; as of this writing there are fewer than 400k mastodon accounts registered on all public servers combined, and I don't know what the number of MAUs is but it's going to be significantly less than even that.
By the numbers I shouldn't care about this thing. It is growing exponentially at the moment, but that's only been happening for the past month or so, and it could run out of gas at any point (that's just how these things are). But there's something about being able to fix the house you live in that makes you want to do it, even if you're never going to make that investment back.
And I think micro-localized, custom-tailored social networks are something worth having. It's fun for tech-savvy people in developed countries to be able to customize their social media, but it's absolutely essential for the countries that are just now joining the internet. As I said in this post, taking facebook and just translating the strings isn't going to cut it when you're in a country that doesn't just speak a different language but has different cultural assumptions and norms. Nigeria, Vietnam, Bangladesh need their own social networks the same as China, Russia, and Japan have their own social networks. If history repeats itself those networks will be isolated islands cut off from the rest of the world.
If Nigerians prefer Nigerian VK over Facebook (and Americans prefer American social networks over Nigerian VK), and Nigerian VK allows federation, then we'll be in a new equilibrium where it makes more sense to federate than it does not to, because there are big pockets of users that you can't capture even if you're the biggest player.
Do I think that will happen? Not really, no. It didn't in China, Russia, or Japan, and history repeats itself more often than not. But the fact that it could happen is making me excited about the internet again in a way that I haven't been in five years.
Jan 15, 2017
From this twitter thread, which people seemed to like.
a lot of the way our technology works is because it was developed with department of defense research grants
the "technology industry" (read: internet) was never about technology, it's about developing new markets
Milton Friedman was a lot farther to the left than his popular image would suggest
there's a huge amount of value to be unlocked from getting people to stop buying new products and to buy used instead
telepresence is a psychology problem, not a tech problem. we don't know what a representation of another person has to have to be seamless
nobody's done anything really new with a computer since around 2010
globalization has pulled hundreds of millions of people out of poverty, and it can pull a billion more out
if there's going to be a "next big thing" in terms of hardware it's going to be Raman spectrometers integrated into phone cameras
the single best thing the government could do for the economy is making it low-risk for low-income people to start businesses
since you can get a consistent investment return higher than inflation wealth will consolidate unless the government intervenes
there's more funding for cancer research than there are good opportunities to spend that funding so spending more money doesn't help much
the current arrangement of american firms designing products and chinese firms making them only has a few years left
the only sector that the "sweatshop" image still applies to is textiles; electronics has been very automated for years
what culture you grew up in, what language you speak, and how much money your parents have matter more for diversity than race or gender
most online ad spending that isn't going to google or facebook is advertisers experimenting with new things and is temporary
sneakernet is seriously underrated
moving lifestyle good consumption from physical to digital goods would be hugely beneficial to the environment
ground robots with locomotion systems that can go up and down stairs make more sense for urban environments than drones
alphago is just a souped up version of td-gammon from 1992
most people don't want immersive media experiences; they want to be able to have IRL conversations at the same time
augmented reality should go back to its airforce roots and try to provide "situational awareness" tools (letting you know what's behind you)
q-learning is a lot more useful for ad targeting than it is for robots or playing video games (which is why google/facebook are funding it)
SQL is great. I love SQL.
nuclear is safer than coal even if you ignore climate change
driving a car is borderline immoral
regulatory capture is a big enough problem that liability is a more effective way of imposing regulation than professional regulators
all of the "machine learning"/"algorithms" that it's sensical to talk about being biased are rebranded actuarial science
despite the fact that on paper the web (as a platform) did everything wrong it's faster to build things in than any other major platform
making robots look/act like humans is silly. if you're making a robot it's because you don't want a person, you want a thing that does a job
capital gains from appreciation on the value of land (not improvements built on land) should be taxed at 100%
inheritance should also be taxed at 100% (though this would be hard to do because people would just put their money in trusts)
compound interest is extremely powerful. debt is someone else's compound interest that you're paying for.
nobody would borrow money on a credit card if they actually understood how much it cost
internet connectivity is probably never going to be cheaper than it is now
there are very few applications where computing power is a constraint, so moore's law doesn't matter much
tech company obsession with "culture" is cultish and creepy
someone needs to rethink textile manufacturing from the ground up so that it's automatable
Jan 12, 2017
I got into a discussion about epistemology on twitter (this is a lot more fun than it sounds), and that made me want to write down my epistemological framework. So here it is, in five bullet points:
- There is a black box ultimate reality, but we have no way to directly observe it
- We have indirect observations that we get through our senses, augmented with tools we make
- Since the only tools/senses we have to observe reality are contaminated with humanness, and the thinking tools we might try to use to minimize the contamination are also contaminated with humanness, we can't say anything about ultimate reality
- The job of science is to make models that take observations of the present and the past and predict observations they weren't given
- There are two quality metrics for models: accuracy and usefulness. Accuracy is the inverse of the difference between predictions and observations (eg: RMSE). Usefulness is the number of times someone asks the model for a prediction for something other than to test the model.
This is influenced by American Pragmatism (Dewey, William James, Rorty), Popper, and statistics (especially George E P Box).
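To make the accuracy metric concrete, here is a minimal sketch in Python. The function names `rmse` and `accuracy`, and the choice of 1/RMSE as "the inverse", are assumptions made for illustration; the post only says accuracy is the inverse of the difference between predictions and observations.

```python
import math

def rmse(predictions, observations):
    """Root-mean-square error between predictions and observations."""
    if not predictions or len(predictions) != len(observations):
        raise ValueError("need one prediction per observation, and at least one")
    squared_errors = [(p - o) ** 2 for p, o in zip(predictions, observations)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def accuracy(predictions, observations):
    """Assumed definition: 1 / RMSE, so a smaller error means higher accuracy."""
    error = rmse(predictions, observations)
    # A perfect fit has zero error; report that as infinite accuracy.
    return math.inf if error == 0 else 1.0 / error
```

Usefulness has no formula like this one. It would be a count of how often the model is consulted outside of testing, which is something you observe rather than compute.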
Aug 23, 2016
I've seen a lot of attempts to explain neural networks to general audiences lately, and I've been disappointed with the amount of mysticism in them. People talk about "brains" and "artificial intelligence" as though deep learning were summoning some dark force from the netherworld to rise up into our plane of existence and take all our jobs.
But neural nets aren't magic.
Neural networks, like all machine learning techniques, are a way to write computer programs where you don't know exactly all the rules. Normally when you program a computer you have to write down exactly what you want it to do in every contingency. "Generate a random number and put it in a memory cell called random_number. Print 'Guess a number' on the screen. Make a memory cell called user_input. Read the characters the user typed until they press enter, convert that to a number, and save it in user_input. If user_input equals random_number, print 'Good job', otherwise print 'Guess again'."
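For comparison, the spelled-out guessing program above looks roughly like this in Python. The names `random_number` and `user_input` come from the description; the 1 to 10 range and the split into `check_guess` and `play` are assumptions added so the sketch is complete.

```python
import random

def check_guess(user_input, random_number):
    """Return the message to print for a single guess."""
    if user_input == random_number:
        return "Good job"
    return "Guess again"

def play(low=1, high=10):
    """Run one round: pick a number, read a guess, print the verdict."""
    # Assumed range; the description does not say which numbers are allowed.
    random_number = random.randint(low, high)
    print("Guess a number")
    # input() reads characters until the user presses enter.
    user_input = int(input())
    print(check_guess(user_input, random_number))
```

Calling `play()` runs one round. Every contingency the program handles is written out by hand, which is the point of the contrast.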
Everything has to be spelled out precisely, and no background knowledge can be assumed. That's fine if you (the programmer) know exactly what you want the computer to do. If you're writing a piece of banking software you know exactly what the rules for tabulating balances are so programming is just a matter of writing them down. If you're doing something like route planning you might have to think a little harder about what the rules should be, but a lot of smart people have spent a long time thinking up rules that you can look up on wikipedia. All is well.
But what if you want the computer to do something where nobody knows the rules, and the best minds have failed in their attempts to come up with them? Machine translation is a great example. As much as you might admire Chomsky there still isn't a complete system of rules to parse and translate between languages, despite decades of research.
So, being a pragmatic engineer with computing power to burn, what do you do? Well, while you might not know what the rules to translate text are, you do know how to write a computer program that can tell whether another computer program is able to translate text. You write some software that goes out on the web and slurps down millions of web pages that have been translated into multiple languages, and pulls out all the text. Then, you write a program that tests a translation program by giving it text it saw in one language, and comparing the translation program's output to the translation of that text from the webpage. The more of the words that are the same between the program output and the human translation, the better the program is doing. (This is a massive oversimplification and sentence alignment is really tricky but let's ignore that for the purposes of this post.) In the jargon this test program is called a loss function.
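A toy version of that test program might look like the sketch below. It is only an illustration of the word-matching idea: the name `translation_loss` is hypothetical, it ignores word order and sentence alignment, and real translation metrics are considerably more involved. The score is flipped so that lower is better, which is the usual convention for a loss.

```python
from collections import Counter

def translation_loss(program_output, human_translation):
    """Fraction of the human translation's words that the program missed.

    0.0 means every word matched, 1.0 means none did (lower is better).
    """
    output_words = Counter(program_output.lower().split())
    reference_words = Counter(human_translation.lower().split())
    if not reference_words:
        # Nothing to match against: only an empty output counts as correct.
        return 0.0 if not output_words else 1.0
    # Counter intersection keeps, per word, the smaller of the two counts.
    matched = sum((output_words & reference_words).values())
    return 1.0 - matched / sum(reference_words.values())
```

For example, `translation_loss("the cat", "the cat sat on")` matches two of the four reference words, so the loss is 0.5.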
There's another thing that works to your advantage. While computers might be simplistic, they're blindingly fast. And what's a technique that gets you a solution if you're simple but fast? Trial and error! So you need two pieces: you need a way for the computer to make tweaks to a broken program, and you need a way for it to keep the tweaks that made it work better and throw away the ones that made it work worse.
The algorithm for doing that tweaking and checking is called Stochastic Gradient Descent. You represent your program as a massive grid of millions of numbers (called a weight matrix). You can think of it as a giant grid of knobs arranged in layers, where each layer is wired into the next one. Input goes into the bottom layer, and signals flow through until they get to the top, which is where you see the output. The positions of the knobs determine how the input gets transformed into the output. If you have enough knobs (and you have millions of them) that grid could do some very complicated things.
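The grid-of-knobs picture can be sketched in a few lines of plain Python. This is a deliberately tiny illustration, not how production libraries are written: the helper names `make_layers` and `forward` are hypothetical, and squashing each signal with `tanh` is an assumption (the post does not say what happens between layers).

```python
import math
import random

def make_layers(sizes, seed=0):
    """Build one grid of knobs per pair of adjacent layers, set at random."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def forward(layers, inputs):
    """Feed inputs into the bottom layer and return the top layer's output."""
    signal = list(inputs)
    for weights in layers:
        # Each outgoing signal is a knob-weighted sum of the incoming
        # signals, squashed into the range -1..1 by tanh (an assumption).
        signal = [
            math.tanh(sum(w * x for w, x in zip(row, signal)))
            for row in weights
        ]
    return signal
```

`forward(make_layers([3, 4, 2]), [1.0, 0.5, -0.5])` pushes three inputs through a four-unit middle layer and returns two outputs. Changing any knob changes how the input is transformed.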
You start by setting the knobs to random positions. You run the loss function on the knobs and get a score. You make a random tweak to some of the knobs, run it again and get another score. You then compare the new score to the old score. If the score is better (I'm saying "better" and not "higher" because lower scores are better and that's confusing), you tweak more in the same direction as last time. If the score is worse, you go in the opposite direction. In math terms you move iteratively down the gradient of the loss with respect to the weight matrix.
You have your computer (or computers, as the case may be) make many many many tweaks and if you're lucky you wind up with a program that does what you want.
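Here is a minimal sketch of the tweak-and-check loop as described, applied to a flat list of knobs. Note that this is the simplified trial-and-error version from the text: it guesses a random tweak and keeps whichever direction lowers the loss, whereas real stochastic gradient descent computes the downhill direction directly. The function name `tune_knobs` and its parameters are hypothetical.

```python
import random

def tune_knobs(loss, n_knobs, steps=2000, step_size=0.1, seed=0):
    """Trial-and-error search: keep tweaks that lower the loss."""
    rng = random.Random(seed)
    # Start with the knobs in random positions and score them.
    knobs = [rng.uniform(-1.0, 1.0) for _ in range(n_knobs)]
    score = loss(knobs)
    for _ in range(steps):
        tweak = [rng.gauss(0.0, step_size) for _ in range(n_knobs)]
        # Try the tweak as drawn; if that is worse, try the opposite direction.
        for sign in (1.0, -1.0):
            candidate = [k + sign * t for k, t in zip(knobs, tweak)]
            new_score = loss(candidate)
            if new_score < score:  # lower is better
                knobs, score = candidate, new_score
                break
    return knobs, score
```

As a usage example, `tune_knobs(lambda k: (k[0] - 3.0) ** 2 + (k[1] + 2.0) ** 2, n_knobs=2)` should end up with knobs close to 3 and -2. The loss can be any function of the knobs, such as a network's error on its training text.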
That's really it. Neural networks are just computer programs that computers come up with by trial and error given a testing program written by a programmer. No magic here.
If you want a more in-depth discussion of neural nets that goes into the math I highly recommend this lecture.
Jul 10, 2016
Bad Reasons to Start A Company
- You want people to like you / want to be in charge of people / think you're a "leader"
- You think you're smarter than everyone else
- You think you have some technical "secret sauce" that no one else has and that this has intrinsic value
- You think your PhD thesis is a product
- Mummy and Daddy are giving you a seed round because they want to get you out of the house
- You live in LA or New York and you're jealous of San Francisco
- You heard that $fad (Big Data/IoT/Adtech/Fintech/whatever) was big
- *handwave* machine learning
- You want to write code and not talk to people
- You want to change the world
Good Reasons to Start A Company
- You know someone with purchasing power at a big company or government who has a problem you can solve and who is willing to pay you real money to solve it. Building the product is essentially doing consulting for them.
- You know someone on the CorpDev team at a big company who says something like "we'd really like to buy a company like X but there just aren't any we can buy"
- You see a market niche that either nobody else sees or everybody else undervalues (eg: because you have experience that's unusual among the sort of people that start companies), and you can build something yourself that fills this niche. Note: don't look for a technical cofounder, become one. Even if you only learn a little of the technical stuff it will make finding people who know more than you much easier.
- You made something as a hobby project that a lot of people are using, and at least some of these people have money
- Someone (either a VC or the government) is willing to give you money to screw around, and you have nothing to lose. That someone should not be family, because you might not be on speaking terms with them in a few years.
- You started doing consulting, and then you started hiring people because you were getting more work than you could do yourself, and one day you realized that you were running a company.