That's a good question, and I think one of the principles from functional programming is to defer all the things to the last possible moment. It works out very well in practice, because if you're throwing away information and doing ad hoc, effectful things at the deepest levels of your code, it makes things harder to understand and harder to change later. If you pull those decisions out, if you pull the effects out and push them to the outer edges of the program, it gives you a lot more flexibility and it makes the code easier to reason about.
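To show what that separation looks like, here is a minimal sketch in Scala, with entirely hypothetical names (Order, Pricing, Receipt) that are not from SlamData: the core logic stays pure and only decides what should happen, while the actual effects live at the outer edge of the program.

```scala
// Minimal sketch of "pull the effects out and push them to the edges".
// All names here are hypothetical, purely for illustration.

// Pure core: data in, data out, no I/O.
final case class Order(id: String, amount: BigDecimal)
final case class Receipt(orderId: String, total: BigDecimal)

object Pricing {
  // Decides WHAT should happen; easy to test and easy to change later.
  def price(order: Order, taxRate: BigDecimal): Receipt =
    Receipt(order.id, order.amount * (BigDecimal(1) + taxRate))
}

// Impure edge: reading input and printing (or persisting) happen out here.
object Main {
  def main(args: Array[String]): Unit = {
    val order   = Order("42", BigDecimal(100))              // e.g. parsed from a request
    val receipt = Pricing.price(order, BigDecimal("0.2"))
    println(s"Receipt ${receipt.orderId}: ${receipt.total}") // the effect, at the edge
  }
}
```

The point is just the separation: Pricing can be understood, tested, and changed without touching any I/O.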
There's a similar principle in architectural design, and I think SlamData is the essence of that attitude of deferring things to the last responsible moment. What I mean by that is, if you choose a solution that's based on Spark, there's a lot of risk associated with that decision, and the reason is that Spark is just one analytics framework to come along in the past 10 years, out of probably 50. Yes, it's reasonably successful, but if you choose a BI platform based on Spark, you as a CIO are making a bet that Spark is going to be around 10 years from now, that it's still going to be supported and maintained, and that it's going to be the latest and greatest. In my opinion, history shows that that is very, very risky.
Before Spark there was MapReduce, and everyone thought MapReduce would be the be-all and end-all of number crunching on Hadoop platforms, and of course that worked out spectacularly poorly. Everyone who bet on MapReduce now has legacy code bases that take hours or days to run, that aren't well supported by any of the major players in Hadoop, and that are written in these totally awkward, cumbersome ways that are extremely difficult to maintain. If you were a CIO who bet on MapReduce as the future of your number-crunching capability, then you bet wrong, and now you're going to spend the next 10 years paying down that technical debt. It's the same way with Spark.
Spark is a computational engine, and BI tools that are powered by Spark work reasonably well now, but in my opinion history has shown us that there will be something else after Spark, whether that's Flink, or Pachyderm, or DataFlow, or any one of the literally dozens of competing technologies out there that are saying: Spark got this wrong, Spark got that wrong, let's do it this way, not that way. Spark is unlikely to be the last computational framework ever invented and adopted by industry. It's likely to be just the next one in a long line of successors, and so BI technology that's wedded to Spark is hugely risky from my perspective, because it will be obsoleted, and the question is not if, it's when.
We've taken a totally different approach, which is that we're not going to tie our technology to a specific number cruncher. Using our core technology, you can take an analytics workflow and we can compile it down to MapReduce; that's just a connector, like any other connector. We can compile it down to Spark. We can compile it down to whatever comes after Spark, and all of that technology is open source. It's the Quasar Analytics supercompiler project. It's 100% open source. It's going to be around forever.
What it allows you to do is state what you want to do, but not how you want to do it, because what you want to do is probably not going to change: you want to do the same sorts of analytics now that you'll want to do in five years and in 10 years. But how you do that is going to depend on which of these projects ends up supplanting Spark, and in all likelihood it won't be just one; there will be lots of different technologies for lots of different use cases.
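As a rough illustration of that idea (this is only a sketch with made-up names, not Quasar's actual API), the "what" can live in a backend-agnostic plan, and each connector supplies the "how" for one particular engine:

```scala
// Illustrative sketch only: a backend-agnostic "what", compiled by
// pluggable connectors into a backend-specific "how".
// None of these names are Quasar's real API.

// The logical plan: states WHAT the analysis is.
sealed trait Plan
final case class Scan(dataset: String)                 extends Plan
final case class Filter(src: Plan, predicate: String)  extends Plan
final case class GroupCount(src: Plan, key: String)    extends Plan

// A connector knows HOW to run a plan on one particular engine.
trait Connector {
  def compile(plan: Plan): String   // here: just render a job description
}

object SparkConnector extends Connector {
  def compile(plan: Plan): String = plan match {
    case Scan(ds)           => s"""spark.read.load("$ds")"""
    case Filter(src, p)     => s"${compile(src)}.filter($p)"
    case GroupCount(src, k) => s"${compile(src)}.groupBy($k).count()"
  }
}

object MapReduceConnector extends Connector {
  def compile(plan: Plan): String = plan match {
    case Scan(ds)           => s"read $ds"
    case Filter(src, p)     => s"map(filter $p) over [${compile(src)}]"
    case GroupCount(src, k) => s"shuffle+reduce(count by $k) over [${compile(src)}]"
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // The same plan, deferred: which engine runs it is decided at the edge.
    val plan = GroupCount(Filter(Scan("orders"), "amount > 100"), "country")
    println(SparkConnector.compile(plan))
    println(MapReduceConnector.compile(plan))
  }
}
```

Supporting whatever comes after Spark then means writing one more connector, without touching the plans people have already written.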
SlamData's approach allows us to effectively support all of them. Instead of wedding our technology to a single computational engine, which is what Zoomdata has done with Spark, we said no, we're going to defer that, and we're going to support all the different ways of approaching data. We don't care: bring in new data sources, bring in new number crunchers or analytics engines or databases or APIs, and we still don't care. We're just going to build the best compiler out there, and we're going to make sure that when some new source comes along, we can support it in ideally a matter of days or weeks, by smashing that cost down and by putting all the smarts in our compiler technology.