The Kepler space telescope has scanned the Milky Way for years, watching for the telltale dip in brightness when a planet-sized object crosses in front of a star.
Its dataset is a great playground for machine learning systems: noisy and voluminous, with subtle variations that could go undetected by simple statistical methods or human scrutiny. A convolutional neural network is just the trick to tease out new and interesting results from that morass.
As is so often the case, though, the AI has to follow a human example. It was trained on thousands of Kepler readings already labeled and verified as planet or non-planet, and learned the patterns that astronomers are interested in. This trained model was what ended up identifying Kepler-90i and Kepler-80g.
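To make that “telltale dip” concrete, here is a toy sketch in Python. It is not the researchers’ neural network: it swaps in a naive median-threshold check on a made-up light curve, the kind of simple statistical method that works on a clean signal and struggles on noisy real data. The function name `find_dips`, the 1 percent depth and the brightness values are all assumptions for illustration.

```python
from statistics import median
from typing import List


def find_dips(flux: List[float], depth: float = 0.01) -> List[int]:
    """Return indices where brightness falls more than `depth` below the median.

    `depth` is a fractional drop (0.01 means 1 percent); it is an
    illustrative default, not a value taken from the Kepler project.
    """
    baseline = median(flux)
    threshold = baseline * (1.0 - depth)
    return [i for i, value in enumerate(flux) if value < threshold]


# Synthetic, normalized light curve: flat near 1.0 with a short transit-like dip.
flux = [1.000, 1.001, 0.999, 1.000, 0.985, 0.984, 0.986, 1.000, 1.001, 0.999]
print(find_dips(flux))
```

On this clean synthetic curve the check flags only the dimmed samples. The point of training a model on labeled examples is to cope with curves where the dip is buried in noise.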
The researchers write that they hope releasing the source for the project will help make it more accurate and also perhaps allow the work to continue at a faster pace or be adapted to new datasets. You can read the documentation and fork the code yourself over at GitHub.
Microsoft today announced its AI Platform for Windows developers, a new set of tools that will soon help developers bring the machine learning models they trained in the cloud to their desktop apps. The AI platform in Windows 10, which will launch with the next major version of Windows, will make use of the GPU on your local machine and allow developers to run their models in real time and without the need for a round trip to the cloud.
“What we are building really completes the story for Microsoft from an AI perspective,” Microsoft Partner Group program manager Kam VedBrat told me. In the past, Microsoft talked a lot about its machine learning infrastructure in the cloud and the tooling it built around this. With this, developers can now easily build their models in the cloud, using their framework of choice and then easily integrate these models with their desktop apps, using Visual Studio and some of the other tooling Microsoft is building for this.
At the core of this is ONNX, a project that is backed by Microsoft, Facebook and Amazon. It allows developers to convert Caffe2, PyTorch, CNTK and other models into the ONNX format to move them between frameworks as necessary.
Microsoft will also allow developers to build image recognition models with the Azure Custom Vision Service and export them for use in Windows ML. Unlike working with traditional frameworks, this doesn’t require a developer to know the intricacies of building machine learning models. All they need to do is give the service their tagged training data.
These models then make use of the silicon that’s available to them in any given machine, which most likely means a DirectX 12 graphics card or, if that’s not available, the CPU. But the platform will also offer a flexible API for accessing other hardware, including future Intel Movidius vision processing units, for example.
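The fallback order described here can be sketched in a few lines of Python. This is purely illustrative and is not the Windows ML API: the device labels and the `pick_device` helper are hypothetical, and the only thing taken from Microsoft’s description is the preference for a DirectX 12 GPU over the CPU.

```python
from typing import Iterable, Sequence

# Hypothetical device labels, listed in order of preference.
DEFAULT_PREFERENCE: Sequence[str] = ("directx12_gpu", "cpu")


def pick_device(available: Iterable[str],
                preference: Sequence[str] = DEFAULT_PREFERENCE) -> str:
    """Return the most preferred device that is actually present."""
    present = set(available)
    for device in preference:
        if device in present:
            return device
    raise RuntimeError("no supported device found")


print(pick_device(["cpu", "directx12_gpu"]))  # GPU wins when both are present
print(pick_device(["cpu"]))                   # otherwise fall back to the CPU
```

Supporting other hardware, such as a vision processing unit, would in this sketch just mean passing a longer `preference` sequence.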
The advantage here, Microsoft corporate VP Kevin Gallo told me, is not just lower latency and increased privacy for your users’ data, but also cost. Running these models in the cloud does, after all, incur a cost that can quickly add up. When they run on the desktop, though, that’s a non-issue.
Starting with the next preview of Visual Studio 15.7, developers can simply add an ONNX file to their Universal Windows Platform (UWP) apps and Visual Studio will generate a model interface for the project. Microsoft will also make tooling for previous versions of Visual Studio available and it’ll add this capability to the Visual Studio tools for AI, too.
Google Lens, the company’s visual search engine that can recognize what’s in your images and scan business cards, among other things, is now rolling out to all Google Photos users on Android. This marks Google’s first major expansion for Lens, which was previously only available to those who had access to the latest Pixel phones. On those phones, Lens is also available through the Google Assistant, but that feature isn’t rolling out to all Android users yet.
Google promises that Lens in Google Photos will roll out to iOS users “soon,” but it’s unclear when exactly this will happen.
Lens can be both frustrating and quite useful — though it never feels indispensable. When it works, it works really well. And while you probably don’t need Lens to tell you that you are standing in front of the Eiffel Tower (unless you are really jet lagged), the fact that it can show you more information about sights, including opening hours, is actually quite useful (though you could just as well do a quick search in Google Maps, too).
The ability to scan a business card is pretty useful, though, unless, of course, you’ve done away with business cards a long time ago and just use LinkedIn anyway.
Personally, I haven’t found much use for Lens so far. It’s a nice parlor trick but it’s easy to forget it exists. Over time, though, it may just get good enough that it’s easier to take a picture of a landmark or restaurant to get more information than searching for it with a keyword.
To realize that the background check industry needs an overhaul, look no further than the backlog of 700,000 background checks faced by the federal agency that handles all background checks for sensitive government positions. This backlog has essentially rendered background checks useless, as many agencies are able to give security clearances on a temporary basis before a background check is even started.
Intelligo is an Israeli company trying to make background checks relevant again by using AI and machine learning to not only speed up and automate the process, but also run more thorough checks.
Launching out of beta today, the company has raised $6.8M to date – a seed round of $1.1M and a Series A of $5.7M. They boast investors like Eileen Murray (Co-CEO of Bridgewater Associates) and advisors like the former director of the NSA Michael McConnell and former Managing Director of the Israel Ministry of Defense Pinhas Buchris.
Currently most serious background checks are done manually. This means that when an analyst creating a report comes across a new data source they need to decide if it’s worth taking the time to parse it and add it to the report. Consequently, many important sources like social media pages and news sites are left out of reports. It also means that background checks can take up to a week or longer, which is frustrating for the company and applicant.
Intelligo’s solution, on the other hand, is primarily driven by an automated machine learning platform that can look at thousands of data sources indiscriminately, without concern for how much manual labor it would take. Reports are also provided in a user-friendly interactive dashboard, a stark contrast to the dozens of typed pages of an old-school background check.
Automating the process also dramatically cuts down on cost – Intelligo says their prices are half of the average market price, which is allowing small and midsize businesses to now get the benefit of a high-level background check that typically would only be used by a larger corporation.
The startup also offers an ongoing monitoring product designed for the investment world. Funds often want the ability to monitor their portfolio companies and management teams even after the initial due diligence process, and by using an automated platform Intelligo can let funds know of management issues long before a human would find the source of the issue.
Google researchers know how much people like to trick others into thinking they’re on the moon, or that it’s night instead of day, and other fun shenanigans only possible if you happen to be in a movie studio in front of a green screen. So they did what any good 2018 coder would do: build a neural network that lets you do it.
This “video segmentation” tool, as they call it (well, as everyone does), is rolling out to YouTube Stories on mobile in a limited fashion starting now. If you see the option, congratulations, you’re a beta tester.
A lot of ingenuity seems to have gone into this feature. It’s a piece of cake to figure out where the foreground ends and the background begins if you have a depth-sensing camera (like the iPhone X’s front-facing array) or plenty of processing time and no battery to think about (like a desktop computer).
On mobile, though, and with an ordinary RGB image, it’s not so easy to do. And if doing a still image is hard, video is even more so, since the computer has to do the calculation 30 times a second at a minimum.
The network learned to pick out the common features of a head and shoulders, and a series of optimizations lowered the amount of data it needed to crunch in order to do so. And — although it’s cheating a bit — the result of the previous calculation (so, a sort of cutout of your head) gets used as raw material for the next one, further reducing load.
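That feedback loop can be sketched roughly as follows. This is a toy under stated assumptions, not Google’s network: `toy_segmenter` is a hypothetical stand-in that blends pixel brightness with the previous mask, and the weights and threshold are invented. What it does illustrate is the structure, with each frame’s mask handed to the next call as an extra input.

```python
from typing import Callable, List

Grid = List[List[float]]  # toy single-channel frame or mask


def toy_segmenter(frame: Grid, prev_mask: Grid) -> Grid:
    """Hypothetical stand-in for the network: blend brightness with the prior mask."""
    return [
        [1.0 if 0.7 * pixel + 0.3 * prior > 0.5 else 0.0
         for pixel, prior in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, prev_mask)
    ]


def segment_video(frames: List[Grid],
                  segment_fn: Callable[[Grid, Grid], Grid]) -> List[Grid]:
    """Segment frames in order, feeding each mask into the next call."""
    masks: List[Grid] = []
    prev_mask = None
    for frame in frames:
        if prev_mask is None:
            # No history for the first frame, so start from an empty mask.
            prev_mask = [[0.0] * len(row) for row in frame]
        prev_mask = segment_fn(frame, prev_mask)
        masks.append(prev_mask)
    return masks


# Two 1x2 frames: the left pixel is bright, then dims in the second frame.
frames = [[[0.9, 0.1]], [[0.4, 0.1]]]
masks = segment_video(frames, toy_segmenter)
print(masks)
```

In the toy run, the dimmed pixel in the second frame stays in the foreground only because the previous mask vouches for it. With an empty prior it would be dropped.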
The result is a fast, relatively accurate segmentation engine that runs more than fast enough to be used in video — 40 frames per second on the Pixel 2 and over 100 on the iPhone 7 (!).
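For context, those frame rates translate into a per-frame time budget of 1,000 milliseconds divided by the frame rate. A quick sketch of the arithmetic, using only the figures quoted above (the helper name `frame_budget_ms` is just for illustration):

```python
def frame_budget_ms(fps: float) -> float:
    """Milliseconds available to process one frame at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


for label, fps in [("30 fps minimum", 30), ("Pixel 2", 40), ("iPhone 7", 100)]:
    print(f"{label}: {frame_budget_ms(fps):.1f} ms per frame")
```

At 30 frames per second the whole pipeline has about 33 milliseconds per frame. At 40 it has 25, and at 100 it has 10.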
This is great news for a lot of folks — removing or replacing a background is a great tool to have in your toolbox and this makes it quite easy. And hopefully it won’t kill your battery.
For IBM Watson CTO Rob High, the biggest technological challenge in machine learning right now is figuring out how to train models with less data. “It’s a challenge, it’s a goal and there’s certainly reason to believe that it’s possible,” High told me during an interview at the annual Mobile World Congress in Barcelona.
With this, he echoes similar statements all across the industry. Google’s AI chief John Giannandrea, for example, also recently listed this as one of the main challenges the search giant’s machine learning groups are trying to tackle. Typically, machine learning models need to be trained on large amounts of data to ensure that they are accurate, but for many problems, that large data set simply doesn’t exist.
High, however, believes this is a solvable problem. Why? “Because humans do it. We have a data point,” he said. “One thing to keep in mind is that even when we see that evidenced in what humans are doing, you have to recognize it’s not just that session, it’s not just that moment that is informing how humans learn. We bring all of this context to the table.” For High, it’s this context, together with recent advances in transfer learning, that will make it possible to train models with less data. Transfer learning is the ability to take one trained model and use what it has learned to kickstart the training of another model where less data may exist.
The challenges for AI — and especially conversational AI — go beyond that, though. “On the other end is really trying to understand how better to interact with humans in ways that they would find natural and that are influential to their thinking,” says High. “Humans are influenced by not just the words that they exchange but also by how we encase those words in vocalizations, inflection, intonation, cadence, temper, facial expression, arm and hand gestures.” High doesn’t think an AI necessarily needs to mimic these in some kind of anthropomorphic form, but maybe in some other form like visual cues on a device.
At the same time, most AI systems also still need to get better at understanding the intent of a question and how that relates to individuals’ previous questions about something, as well as their current state of mind and personality.
That brings up another question, though. Many of these machine learning models that are in use right now are inherently biased because of the data with which they were trained. That often means that a given model will work great for you if you’re a white male but then fails black women, for example. “First of all, I think that there’s two sides to that equation. One is, there may be aggregate bias to this data and we have to be sensitive to that and force ourselves to consider data that broadens the cultural and demographic aspects of the people it represents,” said High. “The flip side of that, though, is that you actually want aggregate bias in these kind of systems over personal bias.”
As an example, High cited work IBM did with the Sloan Kettering Cancer Center. IBM and the hospital trained a model based on the work of some of the best cancer surgeons. “But Sloan Kettering has a particular philosophy about how to do medicine. So that philosophy is embodied in their biases. It’s their institutional biases, it’s their brand. […] And any system that is going to be used outside of Sloan Kettering needs to carry that same philosophy forward.”
“A big part of making sure that these things are biased in the right way is both making sure that you have the right people submitting for and who these people are representative of — of the broader culture.” That’s a discussion that High says now regularly comes up with IBM’s clients, too, which is a positive sign in an industry that still often ignores these kinds of topics.
Part of a new breed of tools that use network analysis and machine learning to respond to potential security breaches, Phantom Cyber had previously raised $22.7 million in funding from investors including Kleiner Perkins Caufield & Byers, Foundation Capital and In-Q-Tel (the investment group affiliated with the Central Intelligence Agency), according to Crunchbase.
Following the acquisition, Phantom Cyber’s executive team will report in to Splunk’s head of security products.
“Sourabh Satish and I founded Phantom to give SOC analysts a powerful advantage over their adversaries, a way to automatically and quickly resolve threats,” said Oliver Friedrichs, Founder and chief executive of Phantom Cyber, in a statement. “Combining SOAR with the industry’s leading big data platform is a revolutionary advance for security and IT teams and will further cut down the time it takes them to eliminate threats and keep the business running.”
As cyber security threats increase — and become increasingly automated — overtaxed security teams inside companies are trying to automate their responses. Automation is also critical for companies since there aren’t enough cybersecurity experts to meet increasing demand.
This isn’t Splunk’s first foray into the security business. The company has steadily built up an expertise in the security market, first through its acquisition of Caspida for roughly $200 million in late 2015 to gain some expertise in real time threat detection and then last year with the purchase of SignalSense, a breach detection service, for an undisclosed amount.
In the future, Splunk expects Phantom Cyber to automate more than just security responses, the company said in a statement — anticipating a change that was predicted by the consulting and analysis firm Gartner earlier this year.
By 2022, 40 percent of all large enterprises will combine big data and machine learning tools to support and replace monitoring, service desk and automation processes and tasks, up from 5 percent today, the firm predicted.