PHP 6: Pissing in the Wind

Posted: 2013-01-26
Category: PHP

PHP is well known for having an inconsistent API when it comes to PHP functions. Anyone with an anti-PHP point of view will use this as one of their top 3 arguments for why PHP sucks, while most PHP developers will point out that they don't really care. This is mostly because we're either used to it, have a god-like photographic memory, or have an IDE that handles auto-complete, so it's a moot point. For me, I'm not too fussed because I spend more time Googling words like recepie (see, I got that wrong) recipe than I ever spend looking up PHP functions.

Another big thing that anti-PHP folks laugh about is the lack of scalar objects, so instead of $string->length() you have to do strlen($string).

ANOTHER thing that people often joke about is how PHP 6.0 just never happened, because the team were trying to bake in Unicode support but came across so many issues that it was never released.

The Obvious Answer

There is a single way to fix all of these issues in a single blow. I'm by no means the first person to think of it, but it blows my mind that it's not being worked on.

PHP 5.x

$foo = "string";
echo strlen($foo); // Outputs: 6
echo $foo->length(); // PHP Fatal error:  Call to a member function length() on a non-object

PHP 6.0

$foo = "string";
echo strlen($foo); // Outputs: 6
echo $foo->length(); // Outputs: 6

PHP 6.1

$foo = "string";
echo strlen($foo); // Outputs: 6 with a PHP Deprecated: use String->length()
echo $foo->length(); // Outputs: 6

PHP 6.2

$foo = "string";
echo strlen($foo); // PHP Fatal error:  Call to undefined function strlen()
echo $foo->length(); // Outputs: 6
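
None of the versions above exist, so here is a minimal sketch of what the method style feels like today using a plain userland wrapper. The class name Str and its methods are invented for illustration; real scalar objects would need engine support rather than a wrapper like this.

```php
<?php
// Hypothetical userland stand-in for a scalar string object.
// The class name Str and its methods are illustrative, not part of PHP.
final class Str
{
    private $value;

    public function __construct($value)
    {
        $this->value = (string) $value;
    }

    // Byte length, same result as strlen() on the wrapped value.
    public function length()
    {
        return strlen($this->value);
    }

    // Returns a new wrapper so calls can be chained.
    public function upper()
    {
        return new self(strtoupper($this->value));
    }

    public function __toString()
    {
        return $this->value;
    }
}

$foo = new Str("string");
echo $foo->length(), "\n";          // 6
echo $foo->upper(), "\n";           // STRING
echo $foo->upper()->length(), "\n"; // 6
```

The obvious cost is that every string has to be wrapped by hand, which is exactly the chore a built-in version would remove.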

Unicode Support

PHP 5.x

$foo = u"string"; // AHH WHAT IS THIS?
echo strlen($foo); // PHP Warning:  strlen() expects parameter 1 to be string, MADNESS given
echo $foo->length(); // PHP Fatal error: Call to a member function length() on a non-object

PHP 6.0

$foo = u"string";
echo strlen($foo); // Warning: strlen() expects parameter 1 to be String, UnicodeString given
echo $foo->length(); // Outputs: 6

If you want to get super detailed, people concerned about UTF-8 or UTF-16 support could even do:

$foo = u"string";
$foo = u16"string";

This shows that the language would default to UTF-8, because that's what most people default to when they give a shit about Unicode support, but gives extra super-powers to those who need UTF-16.
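
Since the u"string" literal is made up, the closest thing that runs today is a small class. This is only a sketch: the UnicodeString name is borrowed from the imaginary warning above, and the class assumes its input is already valid UTF-8. It shows why length() and strlen() would disagree once non-ASCII characters are involved.

```php
<?php
// Hypothetical UnicodeString class; the name mirrors the made-up warning
// in the example above, but nothing like it ships with PHP.
final class UnicodeString
{
    private $bytes; // assumed to be valid UTF-8

    public function __construct($utf8)
    {
        $this->bytes = (string) $utf8;
    }

    // Number of characters (code points), not bytes.
    public function length()
    {
        if (function_exists('mb_strlen')) {
            return mb_strlen($this->bytes, 'UTF-8');
        }
        // Fallback without mbstring: count matches of "any character"
        // in PCRE's UTF-8 mode.
        return preg_match_all('/./us', $this->bytes, $matches);
    }

    // Raw byte count, which is what strlen() reports.
    public function byteLength()
    {
        return strlen($this->bytes);
    }
}

// "héllo" with the é written as its two UTF-8 bytes, so the encoding
// of this source file does not matter.
$foo = new UnicodeString("h\xC3\xA9llo");
echo $foo->byteLength(), "\n"; // 6
echo $foo->length(), "\n";     // 5
```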

So why isn't this happening?

As I see it, there are two major reasons. One is: who is going to do it?

One core contributor, Nikita Popov, who shares this view is currently working on a proof of concept. We've never spoken and I'm not claiming anything; he just seems to share a common opinion, that this is an obvious next step for PHP which avoids breaking any BC while standardising function names in one fell swoop.

Well, if a core PHP contributor is working on it, that means it's happening, right?

Nope, which is my second point. Let's put this into context.

PHP Property Accessors Syntax

This was an absolutely wonderful RFC proposed to PHP which, by the reactions of many PHP developers, looked like a shoo-in.

It gave us the exact same logical getter setter controls that C# offers, and Ruby has something pretty similar. I was excited. Lots of people were excited. Then it got blammed by a vote of 33 for and 21 against.
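
For context, this is roughly what we are stuck with instead. It is a sketch of the workaround using the __get() and __set() magic methods PHP already has, not the syntax the RFC proposed. TimePeriod and its property names are made up for the example.

```php
<?php
// Userland approximation of accessors using existing magic methods.
// TimePeriod and its property names are invented for this sketch.
class TimePeriod
{
    private $seconds = 0;

    // Called when an inaccessible property is read from outside.
    public function __get($name)
    {
        if ($name === 'hours') {
            return $this->seconds / 3600;
        }
        if ($name === 'seconds') {
            return $this->seconds;
        }
        throw new LogicException("No such property: $name");
    }

    // Called when an inaccessible property is written from outside.
    public function __set($name, $value)
    {
        if ($name === 'hours') {
            $this->seconds = $value * 3600;
            return;
        }
        throw new LogicException("Cannot write property: $name");
    }
}

$period = new TimePeriod();
$period->hours = 2;          // routed through __set()
echo $period->seconds, "\n"; // 7200
echo $period->hours, "\n";   // 2
```

It works, but every property ends up as a string comparison inside two catch-all methods, which is the mess a dedicated accessor syntax was meant to tidy up.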

Sadly it's not a majority-wins situation; it had to get a 2/3 majority. So we got fucked. No getter setter syntax for us.
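
To put numbers on how close that was, here is the arithmetic as a quick sketch. It assumes only the yes and no votes quoted above count towards the threshold.

```php
<?php
$for     = 33;
$against = 21;
$total   = $for + $against;              // 54 votes cast

// Smallest number of yes votes that reaches two thirds of the total.
$needed = (int) ceil($total * 2 / 3);    // 36
$share  = round($for / $total * 100, 1); // 61.1

echo "Yes share: {$share}%\n";
echo "Needed: {$needed}, got: {$for}, short by: " . ($needed - $for) . "\n";
```

So a clear majority said yes, and the RFC still fell three votes short.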

Class Name Resolution via "class" Keyword

This is a handy little addition to the language that means you can take any class name and append ::class (as in Foo::class) to get a fully resolved class name. This means when you're trying to use call_user_func on a method of a class you don't need to piss around with strings or get_class(), which is lovely.
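
A small sketch of how that looks in practice, assuming PHP 5.5 or later, where ::class is applied to a class name. The namespace App\Models and the User class are invented for the example.

```php
<?php
namespace App\Models;

// User is a made-up class for this sketch.
class User
{
    public static function describe()
    {
        return 'user model';
    }
}

// ::class resolves the short name against the current namespace.
$name = User::class;
echo $name, "\n"; // App\Models\User

// The resolved string can be handed straight to call_user_func().
echo \call_user_func(array($name, 'describe')), "\n"; // user model
```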

As this is only a little feature only a few votes were needed. I recognise these names as active bloggers, contributors, or people otherwise known as active in the community.

The Little Point

When it's a little feature, whoever is interested in getting it voted in - as long as a reasonable number of active PHP guys agree - it's going to get in. That means a trivial/small feature always has a good shot as long as it makes sense.

But, if you even try to change any sort of syntax on a large scale you need to get a majority. Sadly it seems most of this majority are not the sort of people who vote unless they're asked to vote. It almost seems like they don't really care unless they are asked to care, and when they are asked the response is more often than not "nay".

Really, look up the votes for recent RFCs and see who said "nay" on Getter/Setter. They nay vote a lot.

The Bigger Point

I know in my heart that democracy is mostly a good idea, in the same way that communism started off as a really good idea, but when you have a large number of people making decisions that don't really give a fuck then the people really are not being represented as they should. How often has Rasmus said he prefers procedural code over OOP? Of course plenty of people are voting against drastic improvements to the OOP functionality of PHP, because the core devs can't even decide if PHP is going to be OOP, functional, or whatever!

Another Point

People not being able to get on the same page is one thing, but I heard a reason from a "nay" voter, who I'm going to leave nameless (mainly because I have forgotten his name): he said that merging the getter/setter syntax would require too much maintenance. Right, doing stuff means doing stuff and that is an unfortunate fact of life, but if you don't like doing stuff: quit.

I quit the CodeIgniter development team because I was no longer doing client work with CodeIgniter and had no interest in helping CodeIgniter recode itself to put it into a vaguely competitive position against modern frameworks.

Now, while we don't need PHP to "win", it would be nice if we could get some sort of progress on the problems that have obviously plagued the language for the last decade.

The suggestions I'm making (and that plenty of others have made) are not particularly complex. They require some recoding of core functions, but for developers it would not require a recode of their applications for the foreseeable future.

Basically put, these suggestions won't break shit. Legacy developers can stay on 5.x from now until the end of time, and they could even upgrade to 6.0 for forever too. If they upgrade to 6.1 they'll start getting deprecated errors (which they can turn off) and when they get to 6.2 maybe they'll have problems - but 6.2 will probably drop in about 2020 so who even gives a damn?
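
Here is a rough userland sketch of that middle "6.1" stage, using the deprecation machinery PHP already has. The function legacy_strlen() is an invented stand-in; the real strlen() is untouched and nothing here reflects an actual PHP release.

```php
<?php
error_reporting(E_ALL);

// Hypothetical shim: the old function still works but reports a
// deprecation. legacy_strlen() is an invented name for this sketch.
function legacy_strlen($string)
{
    trigger_error('legacy_strlen() is deprecated, use a length() method', E_USER_DEPRECATED);
    return strlen($string);
}

// Collect notices instead of printing them so the effect is visible.
$notices = array();
set_error_handler(function ($errno, $errstr) use (&$notices) {
    // Custom handlers are called regardless of error_reporting(),
    // so honour the current setting by hand.
    if (!(error_reporting() & $errno)) {
        return true;
    }
    $notices[] = $errstr;
    return true; // handled, skip PHP's default output
});

echo legacy_strlen("string"), "\n"; // 6
echo count($notices), "\n";         // 1

// "Turning it off": mask the deprecation level out of error_reporting.
error_reporting(E_ALL & ~E_USER_DEPRECATED);
echo legacy_strlen("string"), "\n"; // 6
echo count($notices), "\n";         // still 1
```

The old call keeps returning the right answer the whole time, which is the point: nothing breaks until the function is actually removed.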

Summary

I'd love to see this change happen. It's going to take effort, and I'd love to be able to help, but I suck with C (beyond making some robot fighting game in college) so I'm out of the picture. Seriously, while I'm a Pull Request or STFU kinda guy, the last person that should be doing this sort of change is me.

People who give a shit (like Nikita Popov) need to be working on this, and people who think it's a good idea need to get on board. People with no opinion should have a little look at how the majority of PHP developers are using PHP these days.

It's not just a language for noobs, juniors, idiots, designers who like if statements and other bottom dwellers to peddle shit. It's a serious language with known defects, used despite that fact to make some impressive systems (and WordPress). PHP runs almost 80% of the internet and as such it has become a haven for people who want to spread their code to as many users as possible, and is not just the shit-storm so many people pretend it is. PHP 4 was a piece of shit, PHP 5.1 was ok, PHP 5.2 was reasonable, by 5.3 it got awesome and 5.4 and 5.5 are adding to it so hard.

Let's keep making it brilliant, not so PHP can win some imaginary competition, but so the people who make distributed applications can continue to not only achieve the objective of "Getting onto as many servers as possible" but also "not be forced to write retarded code because that is all the language is capable of doing". There is a middle-ground, and this one change would handle so many of the problems that PHP suffers from.

Making changes to this language should not be blocked just because a quiet minority of the core team don't like the idea of being asked to do stuff.

Comments

Gravatar
Lukas

2013-01-26

Note I am not an expert on this, but this is what I remember from the discussion back then. The entire UTF-8 vs. UTF-16 isn't so easy. PHP6 was using UTF-16 as this is what ICU used. Going UTF-8 means searching for a new lib. Now IIRC the Drizzle guys are indeed working on such a lib. Just pointing out that things didn't happen back then not because some core developers prevented obvious advancements but because it's non-obvious. Plus obviously all of this will increase memory usage.

Gravatar

2013-01-26

Lukas: Doing more things takes more time. Fact. If there is no standard library for UTF-16 then maybe we can just skip that whole situation. Right now we have no core UTF support, so adding UTF-8 with u"string" does not seem all that unreasonable. Then maybe later on we add in u16"string" which uses more memory for those that want it.

Let's not try and solve every problem that everyone might ever have and worry about it being too many things, let's solve one problem at a time.

Gravatar
Jelmer

2013-01-26

I agree with most of what you say but I also think it's missing one point that was made by Anthony Ferrara recently on the core mailinglist: PHP doesn't have a vision. The current organic growth and random-feeling outcomes of the democratic process are because nobody has a clue of the direction to be taken. There's no big plan, nothing that's being worked towards. The closest there is was said by Rasmus, it came down to: what we have is good enough.

I'm afraid that's the amount of vision and direction PHP can expect for the future. Unless the core devs decide on a vision for the future (or at least a general direction), the staleness of PHP will endure. And its conservative attitude may eventually cause its demise when new developers start looking to more innovative languages.

Gravatar

2013-01-26

Jelmer: I completely agree, and this is why I linked to Anthony's post and led into this saying the article is pointless.

"What we have is good enough" so you can sneak in some little changes that are kinda nice, but any major improvements can go fuck themselves.

Gravatar
Florin Patan

2013-01-26

There's more to it than this.
While I lack the C skills, like you've said, there's a whole bunch of things that could be added but they aren't:
https://wiki.php.net/rfc/namedparameters - functionality of PHP
https://wiki.php.net/rfc/ast_based_parsing_compilation_process - PHP itself
https://wiki.php.net/rfc/lemon - PHP itself
https://wiki.php.net/rfc/grisu3-strtod - PHP itself

Look at the dates on some issues and it will give you an idea about some things that happen in the internals.

Even if we don't know C, there's yet another problem, one that would be inconceivable for PHP projects but that PHP itself has: a lack of documentation for developers. There's next to 0 documentation for how one could work with the Zend Engine API, Zend Engine itself and any other internals of PHP.

I get it, there are just a few people that actually maintain PHP, and that's bad considering the number of people that work with it, but they can't expect to encourage cooperation when people that could help out are missing the basic documentation.

Rasmus recently said on the internals ML that if every organization past a certain point that uses PHP also had a C programmer, things would be different. That's true, but in most cases such a programmer wouldn't be cost efficient for the organization, nor could an organization afford to have a developer not contributing to PHP for a long time just because the internals are not properly documented or, worse, because changes like the ones I've linked are ignored.

My model for getting PHP 6 is: do a major userland break for functions, OOP et al, take your time to improve things now that you know what you need from PHP (like opcode caches) and just stay on it for 2 years if you need that time. Meanwhile just security/bug fixes should be enough if the whole community knew that a major version was coming. And I bet that if such an initiative were started then some (new) contributors would be able to help as well. Your model is a bit better than mine in respect to breakage but it still should be done.

We, the PHP users community, should stop requesting new features from PHP for a time and request these kinds of improvements. Having a better PHP engine would be good for everyone, including the core devs/maintainers.

Just my view on things.

Gravatar

2013-01-26

Named parameters are a perfect example, thanks for linking this back up because I haven't seen it in ages.

The conversation was essentially:

"Named parameters would be really useful!"
"Nah".

Good talk.

Gravatar
Michael Kimsal

2013-01-26

I'd bet dollars to donuts many of these 'nay' voters were also quick to say "well just don't use it" when their own pet projects were being voted on and criticized. When the shoe is on the other foot... Maybe there's still time for me to push my groovy-style getter/setter syntax now that the C#-style seems to have been sidelined.

Gravatar
Errorerrorerrorerror

2013-01-26

Couldn't agree more.

If they think PHP should not be about OOP then keep PHP 5 like it is and just apply security fixes. If they aren't using OOP they don't need anything new, since they can hack "le wordpress home pages" without any problems. Let all the devs with this vision work on PHP 5 forever fixing every security issue possible and don't add any new features so they don't have that "painful extra work" they don't want. You don't even have to upgrade anything anymore, it's like heaven!

Now allow a group of developers that are interested in making PHP 6 a modern language to work on a new API that works more OOP-like out of the box. I hate myself for not knowing C enough to help, but I think I would hate myself even more if I had to work with Rasmus-like creatures.

I would even suggest a new domain for PHP6 that isn't bloated with bad comments and outdated examples. Maybe even use "phptherightway" as the base for this. Like Florin Patan said, I don't care if it takes 2 years for the first "beta" but at least put something on the horizon so we know the language has a future.




Gravatar
Adam P.

2013-01-26

You gloss over a lot of complexity in saying "just let us use UTF8 strings". Sure, adding them internally might only be a significant undertaking. So let's do it.

Now what are you going to do with those strings?

Put them in a database? What's the support like for UTF8 in the client libraries? Do we need to wrap functions to call UTF-8 equivalents? What's the performance impact? How will this affect users using the default MySQL character set and not utf8? Will everyone need to migrate their databases?

Put them in a file? How will PHP detect the character set of files? Or is it up to every developer to migrate any textually stored data to UTF-8 on their own? Does PHP provide a flag on opening files to specify the character set to allow users to continue to work with plain ASCII files (probably required for any kind of realistic interoperability)?

What about all of PHP's extensions? Who is going to audit and/or patch those? Does cURL support UTF8? Which ones need to? Do we need to wrap to call different ASCII/UTF8 functions depending on what the user passes in? What's the impact of doing this check all over the codebase?

What happens when a user passes a UTF8 string into an extension that doesn't support UTF8? Error? Convert to ASCII and try and replace non-representable characters with like-characters? Convert to ASCII and drop any non-representable characters? Convert to ASCII but only if all the characters can be represented, otherwise raise an error? What's the appropriate solution? Be lazy and let the user decide with a system-wide option? Have some sort of call-time syntax for specifying? What are the pros and cons of each with respect to compatibility, interoperability, etc, etc?

These are solved problems, but the solutions still need to be examined and implemented where appropriate. The change is not nearly as simple as "well just put in UTF8 and let us use it!".

Gravatar
Levi Morrison

2013-01-26

Disclaimer: I voted no on the property accessors RFC. I vote no on a lot of things.

I want to point out that not all people who vote 'No' do so just because they don't like an idea or progress being made in PHP. I know several people, myself included, who really like the idea of accessors in PHP but felt like this was not the right implementation. Is that bad? Not at all. We'd rather rework the implementation and figure out a better way.

This is the reason I have voted no on almost all of the recent RFCs. Do I like the idea? Yes. Do I like *this* implementation? No, I don't.

Gravatar

2013-01-26

I don't think I'm glossing over anything, I think you're using a lot of question marks to try to complicate a very simple matter.

Right now, my PHP applications support UTF-8. My database is UTF-8, my meta tags are UTF-8 and my forms even contain accept-charset="utf-8". Everything is UTF-8. If any of those things are not UTF-8, I get garbled characters.

My main question is why are we discussing what happens in my database? There are already plenty of optional ways to handle charsets, nothing here changes. Strings are still strings, nothing breaks for legacy applications.

We have mb_strlen() which will be the same as u"foo"->length() so this is also a non issue.

> What happens when a user passes a UTF8 string into an extension that doesn't support UTF8? Error?

Correct.

You really are making this sound awfully complicated. It's not.

Strings are strings. UTF-8 strings are a new type of object, which functions can support or not. They can be passed to (string) to be typecast into basic strings, which can convert them to the closest ASCII possible, much like changing a float to an int does the best it can.

If any extension is not happy supporting UTF-8 chars the developer can typecast the string, or use the charset conversion functions already available in PHP.

Gravatar

2013-01-26

Levi Morrison: Fair enough.

I would really like to see the improved proposal for attribute accessors. Do you have the RFC kicking around, or a blog or something? Plenty of PHP developers are pissed that that RFC didn't make it - so seeing an improved one that might happen for PHP 5.6 would certainly appease that whole situation.

Gravatar
Levi Morrison

2013-01-26

As of now there is not another RFC for improved accessors. There has been some discussion on what blocked it from making it this time around and what we might be able to do to address those issues.

At the same time, I'm exploring completely different approaches and asking people what they like and do not like about them. Here's a 4 question survey of an attempt to use annotations as the delivery mechanism for accessors: https://gist.github.com/4628068 . (By the way, the feedback on that one has basically been: NO! PLEASE! ARGH! I personally really like the idea. Oh well.)

Gravatar
Wayne

2013-01-26

Your suggestion on how to implement Unicode I've posted to reddit dozens of times, here's my take on it:

http://www.reddit.com/r/PHP/comments/wp2po/phpinternals_pseudoobjects_methods_on_arrays/c5fmufq

The biggest difference between our two methods is that I would recommend *none* of the built-in functions/methods would support unicode strings -- so strlen(u"hello") would be a type error. The problem with PHP 6.0 is the fact that it required a complete re-write of every single function everywhere -- it's a multi-year long project that's boring as hell and very little else could be done while it was going on. It was doomed to failure from the start. Instead, we should just add a new unicode type (that by default is already incompatible with normal strings) and then add methods to manipulate them.

This would also improve programmer discipline when it comes to unicode -- you want to output it to the screen, then you need to convert to a regular string with the encoding that you want. Internally, it doesn't matter what encoding is used.

(I'd like to comment more but I have to run)

Gravatar
Wayne

2013-01-26

> Levi: This is the reason I have voted no on almost all of the recent RFCs. Do I like the idea? Yes. Do I like *this* implementation? No, I don't.

Are PHP developers now gun-shy? For years, they've been ragged on for developing immediate solutions that cause problems in the long term. Now everything has to be *perfect* to make it in? I'm starting to think that *this* is the problem... nobody wants to commit to any design ideas anymore.

Gravatar
Lukas

2013-01-26

@Phil: Clarification: there is no lib for UTF-8 that I am aware of. ICU covers UTF-16, but going UTF-16 means a lot of memory overhead which is probably one of the reasons why PHP6 didn't fly.

@Florin: There is http://php.net/manual/en/internals2.php for documentation on the internals of PHP.

Gravatar
Lester Caine

2013-01-26

Do we really need another language? If python/ruby/C# are so good then why not use one of them instead, and leave those of us who are now wasting months re-writing PHP code to comply with the CURRENT list of useless changes with a system that we can produce improved projects on, rather than having to re-test everything on the next batch of changes!

I have perfectly good code doing a reliable job on machines that are now running obsolete versions of PHP simply because that software does NOT run reliably on later versions and there is no money or time to re-write what IS perfectly functional code.

Gravatar
Jonob

2013-01-26

I wonder when the time will come when php itself is forked because of too many die hards who don't/can't change.

Also, you have to kind of shake your head in amazement when you see something like "We don't see the real need for named parameters, as they seem to violate PHP's KISS principle. It also makes for messier code." Really?

Gravatar
Mike Wales

2013-01-27

Completely agreed. A lot of it comes down to PHP history and insistence on coddling and bottle feeding developers, like Adam's reply above. Does cURL support PHP? None of your goddamn business, are we discussing cURL? It doesn't support Excel files either. Why should PHP care what happens when I make a call to my database? It's not your lane! Hey, I have my blender on my network - why does mysql_connect() not turn it on?

Mind your own fucking business and just be a damn good programming language. You're being nicer than I, Phil. I'd take the "let them burn" approach for 6.0 - with a guarantee that BC is broken and your app will require a rewrite.

Gravatar
Danny Kopping

2013-01-27

It seems like the simple binary expression of Yes/No to an RFC might be causing this tension. Excuse my ignorance here as this may already be happening, but what if voters could expand on their decision with a comment? Maybe there could be gradations of assent, some with a positive score, some with a negative score, for example:

Hell yes - I really want this: Positive
OK - I'm not fussed either way: Positive
Don't like the implementation, but interested in the idea: Negative
The idea sucks, fuck it: Negative

I think that could add some context to voters' decisions, no?

Gravatar
Caruccio

2013-01-27

Seriously, why don't you all give up on php and move to python? It's easier, saner and has support for every imaginable lib/environment/system one needs. Please note I'm not trolling. I mean it.

Gravatar
Mike

2013-01-27

Personally, functional and procedural syntax is php's bread and butter from the start. The obsession with making it like any other language and making everything OO drives me nuts.

Sometimes I wish someone would fork php and remove all the object related stuff. Amazingly I bet it would decrease an immense amount in size, become much simpler, and not lose *any* functionality (except some of the new date time stuff that I think only has OO options, I believe)

I agree with forcing syntax changes with placeholder functions that throw BC warnings for a couple point releases or just one major release. Gotta get rid of the cruft at some point.

However, if anyone is interested in porting php back to be functions only, no namespaces, nothing complicated, make needle and haystack consistent, function names consistent and cleaner, etc.. Tweet @mike503 :) just think - code that is portable to almost any version, no APC issues due to the complexities of creating OO on top of a procedural base...

Gravatar
Orion Blastar

2013-01-27

I think the goal of PHP should be to make it easier to develop for it so that more people can use it.

I see the changes proposed, but it would shut off legacy code and cause legacy PHP code to no longer work and confuse the people new to it. This is what happened when Microsoft changed Visual BASIC to Visual BASIC.Net and people demanded Classic Visual BASIC back, and then Visual BASIC became a 'toy' language as people moved on to C# instead.

I think people are voting 'nay' because it makes legacy PHP code no longer function, is there a way to keep the old way and add in the new way? Have two ways of doing things so that legacy PHP code still runs, and can easily be modified to the new way if needed?

I understand the need of UNICODE strings to support many different foreign languages so that PHP apps aren't just limited to Latin languages, and it would allow a greater globalization of web sites and source code into other areas of the world. But you still need the old strings for systems and libraries that don't support UNICODE yet. Why not simply introduce a unicode string data type that has the option to convert to the older string type?

Gravatar
Thiago Belem

2013-01-27

Phil, do you give me permission to freely translate this article (to Portuguese) and post it on my blog?

I think the Brazilian community needs to keep one eye on this kind of stuff.

Gravatar
Florin Patan

2013-01-27

@Lukas: I know about that page, but it's not nearly enough to get you properly started into say adding features/fixing bugs and so on.

I've asked on #php.pecl about links for where I should start learning about the internals, and all I got was that there's some tutorial written in 2005 on zend.com that still matches a little bit. There's a book about it from the same time with the same 'sort-of matches' as well and a link to a wiki where there's a bunch of external resources about php from various other blogs.

@All who say that people who don't know C should stop whining about features and such and pick up C and do it yourself.... Right, I could do that, it would take me about a year to be able to actually produce a viable patch, that is if nothing comes into my personal life. Even so, should I provide patches and so on, it's discouraging to see how other well documented RFCs get the 'nay' vote from people who don't participate actively in this or with reasons like: "- the current functions reflect the underlying functions used. I've been a PHP user (developer) for a couple of years now and I really don't care what's the parameter order for the functions that PHP use to help me out getting my job done" (from the response I gave on the mailing list).

Gravatar
Lester Caine

2013-01-27

I did backpedal at one point and was seriously looking at what security fixes were really necessary to be back ported to 5.2! Had I not spent so much time working on porting code from PHP5.2 to PHP5.4 and got stable 5.4 setups on a couple of the servers, I would have. That and the fact that the core code is not an easy area to work in. As others have said, documentation is very much lacking. I'm using PHP because, having been programming in C++ in the 90's, its RELAXED way of working was a refreshing change. Ramming all of the crud that slows down C++ development into PHP is what is getting my goat, and no one is paying me to keep fixing code that is not broken in the first place! And now you want named parameters and type checking ...

Am I alone in wanting to retain the simple relaxed programming language that PHP used to be?

Gravatar
Jefferson

2013-01-27

Just curious, why hasn't anybody forked PHP? Does their license prevent it? (A quick Google search didn't reveal anything relevant.)

Gravatar
Sergey

2013-01-27

80% of the internet will not move to other language ;)

Gravatar
Tyrael

2013-01-27

"Just curious, why hasn't anybody forked PHP?"

because those who always know better what should be changed are never capable of doing anything on their own, but demand others do it for them.

"Does their license prevent it?"

the PHP license allows it (although you couldn't call it PHPsomething or somethingPHP): http://www.php.net/license/3_01.txt

to the topic:
I think the biggest misunderstanding is that the php userland community overestimates the number of volunteers developing the php-src.
Based on the market share of the language itself it can be really surprising but there are only a bunch of active contributors at any point in time.

Adding something to the language can be useful for a percentage of the users, but usually it costs something for every user (memory consumption, cpu cycles, added complexity in the code, etc.), so adding stuff should only be done when the pros outweigh the cons.
see http://en.wikipedia.org/wiki/Feature_creep for more details.
even those people who can come up with an RFC and a patch can disappear after they pushed their stuff into the core, and we can end up in a situation where the team has to support/maintain something what nobody else but the author understood perfectly (phar anyone?).

the other thing is that because this is an open-source project done by volunteers, most contributors are focusing on either their own needs or the new shiny things, so the hard and boring stuff like implementing an AST based parser or adding proper unicode support gets neglected.

would be nice if people would stop complaining and shittalking the ones doing the work, and instead ask themselves what they can do to improve the situation (and I'm not convinced that writing witty blogposts will move anything forward).
'ask not what your country can do for you – ask what you can do for your country'

Gravatar

2013-01-27

Lester: I don't believe I've "shittalked" anyone in this article and I have not "demanded" anything, but thank you for calling it witty.

You act like I know nothing about open-source. I do not contribute to PHP directly but I run PyroCMS and have been a core contributor for CodeIgniter and FuelPHP. These projects taught me how it can be hard to maintain high levels of code contribution and fit that around work, hobbies and relationships. I get it, it's tough - but we all have a reason for doing it and none of us are heroes simply for contributing to a project that we agreed to contribute to. If we weren't contributing then we should quit, because we're failing to fulfill the primary role of a contributor. See my point?

The trouble here is that people (users, and core developers) want features like this to happen, and some people just want no change. That is the crux of the issue, which I would love to see a solution to.

On to your final point. How can I help PHP?

Let's say - for the sake of argument - that I walk outside, buy a "C for Dummies" book and read that thing cover to cover. In a month I might be a reasonable C developer and after a few more months I might have a good enough understanding of how PHP works internally to fire my code into a feature branch and post that RFC up on the board, then... what? If people are dismissing things like Property Accessor syntax after it's coded up by an experienced C and PHP dev, then why are they all going to vote yes on an RFC bashed together by somebody who is brand new to this?

That's not an excuse, that's logic. It's very possible that so few people get involved in core development because there is very little motivation to do so. "I can work really hard on something that I feel really passionate about, then have it totally ignored by a bunch of people who think PHP should just remain a cluster-fuck of inconsistent PHP functions." Yaaaay!

See the issue? It's very easy to say "Well why don't you do more?" but when people see those that try being met with negativity in such a big way it REALLY doesn't make me want to even try to help.

Jonas

2013-01-27

You speak of it as if it were some big hurdle to overcome, but sorry, strlen isn't even on the map of important problems to solve for PHP to have Unicode.

The big problem with the last PHP6 design was that the PHP team decided to use some almost-UTF16 encoding which (a) nobody uses and (b) causes all kinds of problems with the metric ton of crap that PHP has to link to.

I do not share your opinion that PHP's namespace would be somehow cleared up if string lengths got to be a method instead of a function since it's not like the rest of the calling conventions adhere to this (and will probably never, because the language designer himself seems to oppose it strongly).

Finally, I'm not sure that Python-style u""-syntax for Unicode strings really is something to mimic. Separating the two types like that leads to all sorts of confusion when linking C code, and requires the programmer to constantly convert between the two. This is probably Python's biggest design flaw and a constant source of frustration. I'm not sure PHP could do the conversion magically unless the developers decided that from now on all i/o is in UTF8, in which case you could just switch to UTF8 strings and be done with it.

Better to look at Perl for inspiration, imho, because it is syntactically very close to PHP and solved this problem over 10 years ago. PHP was a poor clone of Perl 4 and never copied the improvements and cleanups Perl has had during version 5, but I think that most of them turned out to be very good design and could well be used for a new PHP as well.

2013-01-27

Jonas: I think you have misread or misunderstood this article entirely.

strlen() is an example of a random procedural function that I would like to see taken around the back of the woodshed and put down. Not because it in itself is bad, but because over time strlen(), str_replace(), bin2hex() just got silly.

I'll be the first to say it doesn't matter. It's inconsistent, and we're all big enough and ugly enough to work through it - but that doesn't mean it should never ever change.

Moving them to be methods, by itself, is nice. You never need to worry about where the haystack is, because the haystack is the object. At the same time we get to pick new method names for ALL of them. $foo->length() and $foo->replace('foo', 'bar') is nice and simple, right?
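
As a rough userland sketch of that idea (the `Str` class and its method names are made up for illustration, they are not part of PHP, and a real scalar-object implementation would live in the engine rather than in a wrapper like this):

```php
<?php
// Hypothetical wrapper class: `Str` does not exist in PHP itself.
final class Str
{
    private $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }

    // Same result as strlen($value): the length in bytes.
    public function length(): int
    {
        return strlen($this->value);
    }

    // The haystack is always $this, so only search and replacement are passed.
    public function replace(string $search, string $replace): self
    {
        return new self(str_replace($search, $replace, $this->value));
    }

    public function __toString(): string
    {
        return $this->value;
    }
}

$foo = new Str('foo baz');
echo $foo->length(), "\n";               // 7
echo $foo->replace('foo', 'bar'), "\n";  // bar baz
```

The point of the sketch is only the calling convention: there is no argument order to remember, because the subject of the operation is the object itself.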

THAT is what would clean up the global namespace, because a few versions later those can be deprecated then removed, just like the mysql_ functions - which are terrible.

THEN I started talking about Unicode, which you seem to think was rolled up into the above point.

Right now PHP has ASCII characters, and you can convert to and from various different encodings with random functions like utf8_encode(). Right? If you take a UTF-8 character and shove it into a field that is not UTF-8 you're going to have a bad time. That's how the browser and database work, and it's how I'd expect PHP to work.

Now you might suggest you could do this: $string = u"Foo" or this: $string = utf8_encode("Foo");

The latter might be more "consistent with PHP" but would involve entirely repurposing an existing function. I don't know why it wouldn't be syntactically ok to prepend with a u, and have this be a "UTF-8 String Object", which could extend "String Object".

I don't know why UTF-16 even became part of the PHP 6 suggestion in the first place. It's slower, more CPU and memory intensive and is not even that popular amongst the sort of people who are using PHP to build stuff. So... maybe have it as a "UTF-16 String Object".

All of these objects can share toCharset('UTF-8'); toUtf8() style methods, to convert to ASCII, UTF-16, ISO 8859-1, whatever.
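
A minimal sketch of how such objects could hang together, assuming the mbstring extension is loaded. The class names `StringObject` and `Utf8StringObject` and the `toCharset()` method are hypothetical, taken from the wording above rather than from anything PHP ships:

```php
<?php
// Hypothetical classes; nothing here is part of PHP itself.
// Assumes the mbstring extension is available.
class StringObject
{
    protected $bytes;
    protected $charset;

    public function __construct(string $bytes, string $charset)
    {
        $this->bytes = $bytes;
        $this->charset = $charset;
    }

    // Length in characters, according to the string's own charset.
    public function length(): int
    {
        return mb_strlen($this->bytes, $this->charset);
    }

    // Convert to another charset and return a new object.
    public function toCharset(string $charset): StringObject
    {
        $converted = mb_convert_encoding($this->bytes, $charset, $this->charset);
        return new StringObject($converted, $charset);
    }

    // Raw bytes, for handing to anything that still expects a plain string.
    public function bytes(): string
    {
        return $this->bytes;
    }
}

// What u"..." could stand for: a string object known to hold UTF-8.
class Utf8StringObject extends StringObject
{
    public function __construct(string $bytes)
    {
        parent::__construct($bytes, 'UTF-8');
    }
}

$s = new Utf8StringObject("caf\xC3\xA9");                // "café" as UTF-8 bytes
echo $s->length(), "\n";                                  // 4 characters
echo strlen($s->bytes()), "\n";                           // 5 bytes
echo strlen($s->toCharset('UTF-16BE')->bytes()), "\n";    // 8 bytes
echo strlen($s->toCharset('ISO-8859-1')->bytes()), "\n";  // 4 bytes
```

Because every object carries its charset with it, length() can count characters instead of bytes without the caller having to say which encoding is in play.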

I'm not saying it's going to solve everything, but I am saying switching to use objects for strings makes the entire language more consistent, cleans up the garbage in userland, and makes unicode support easier going forwards.

Joe Watkins

2013-01-27

Let's get one thing straight. Writing a project in PHP is nothing at all like contributing to the sources; writing in PHP and writing in C are roughly a billion miles apart. You likely have no idea what it is like to contribute or work on or with code you do not understand, that's the difference. With the exception of literally a handful of people there is nobody working on PHP that could do anything they or the community wanted, there's nobody with even close to that sort of knowledge. PHP is driven forwards by volunteers and dedicated documenters; if it weren't for them giving you their spare time, you would have no project, you would have no opinion.

You should be a bit more sensible when forming your opinions, perhaps. If some highly qualified souls have deemed a feature unnecessary, including the original innovator we owe most of PHP and APC to, Rasmus, then a feature is actually unnecessary, no matter the opinions of people not even able to understand the code in the patch in question, never mind any other part of PHP. And after all, if you really want accessors, then the code is out there and nobody is stopping you from using it.

You've already admitted you lack the knowledge to judge, in any useful or productive way, any of the decisions made, which annoys the people who spend their time contributing to PHP with code, ideas, time, or knowledge.

If you actually want to do something useful you don't have to write in C: you can edit documentation, try and replicate bugs, go on the internet and make useful educative posts for new users or the elderly. You will get thanked for that, possibly even some inner peace in it for you. Keep making posts like this and no one will thank you for that, they'll just think you're a total arse.

Lester Caine

2013-01-27

Phil
I started with PHP just as PHP5 reached RC stage so I skipped PHP4 and jumped straight in. At that time we were being offered the line that PHP6 would be following in quick succession and would address the Unicode problems with a 'clean' solution. The rest is history? We still don't have a clean multibyte character based core and more 'extras' are being bolted on the side of PHP5.?
We NEED PHP6 as the clean base that was being proposed 10 years ago. My only gripe is with the fact that we don't have an LTS version of an older PHP5.? that we can work with while all the different versions of the new stuff are reworked yet again.
Rasmus has now pointed out what is needed internally as a next step and Zeev has offered to OS the code which could be a base for that, but personally I am not happy that it becomes PHP5.X - We need a freeze point and to move on to PHP6?

Tyrael

2013-01-27

"I get it, its tough - but we all have a reason for doing it and none of us are heroes simply for contributing to a project that we agreed to contribute too. If we weren't contributing then we should quit, because we're failing to fulfill the primary role of a contributor. See my point?"
I see your point (the no voters are blocking the development, hence they should quit), and I also see why that logic is flawed:
Sometimes it is better to not add something, or not add it before it is ready.
The php project holds backward compatibility in high regard, so adding something which turns out to be a bad idea (magic_quotes, register_globals), or adding a somewhat broken implementation (open_basedir, phar, traits) to the core can hurt a lot, as we can't just remove/fix it right away.
Some of those no voters are around the project for a long time and they are somewhat more rigid about making sure that it won't happen again.

"The trouble here is that people (users, and core developers) want features like this to happen, and some people just want no change. That is the crux of the issue, which I would love to see a solution to."
as Rasmus said in a talk at Etsy, when php was small, it was easy, as he literally had access to every box running php, so if he wanted to break BC, he could fix the code for everybody out there.
but that isn't true anymore, so we have to be more careful about introducing changes.
personally I voted no on the accessor RFC, because there were too many changes in the implementation before the vote and I didn't want to accept such a risky change to the language right before the feature freeze for the 5.5 version.

"On to your final point. How can I help PHP?"
we have a bunch of areas where one could help.
we have thousands of open bugreports at https://bugs.php.net/ where many of the bugs are not real issues but documentation problems or bogus reports which could be sorted out without any C-fu.
simple things like trying out the reproduction steps and commenting whether the issue is still there would also be helpful.
we always need people in the documentation and in the translation teams.
one can also help in the php.net redesign effort (aka. http://prototype.php.net/ )
qa is also an area where we need more feedback:
building php from source and executing make test and sending us the reports is the easiest way to help, and looking into the failed tests and trying to figure out whether the code or the test is wrong is a huge help.
following the alpha/beta/RC announcements, testing your application and reporting any problems is also a great way to help, as that greatly reduces the chance that we ship a release with a serious problem (like 5.3.7 for example).

"Property Accessor syntax after it's coded up by an experienced C and PHP dev"
Clint Priest didn't have any prior php internals knowledge before he started working on his accessor rfc, he was just smart and persistent enough to pull it through.

"then why are they all going to vote yes on an RFC bashed together by somebody who is brand new to this?"
It depends on the idea. if you propose some hard/tricky stuff without a patch, or without at least understanding what you are talking about, your chances are slim.
If you propose something which others can tell that it is technically feasible, and the idea is good, then somebody else will cook up a patch, this happened in the past a couple of times.

"That's not an excuse, thats logic. It's very possible that so few people get involved in core development because there is very little motivation to do so. "I can work really hard on something that I feel really passionate about, then have it totally ignored by a bunch of people who think PHP should just remain a cluster-fuck of inconsistent PHP functions." Yaaaay! "
We never ignore people, so I assume that by ignoring you mean the no voters.
That is something which can happen, and I understand why the author would feel bad about it.
If you remember the annotations vote that was exactly like this, but there were other cases where the person prevailed at the end: the short array syntax was rejected first, the finally keyword wasn't considered for years until laruence did a POC implementation to prove that it is possible (and it was/had to be almost entirely rewritten after the successful vote...), or the traits rfc where first nobody really liked the idea, but Stefan was persistent and convinced the majority (and the implementation here also had major problems, which were only addressed after the production release).

unfortunately it is a chicken-and-egg problem: you can't ask people to vote before seeing the implementation and you also don't want to implement it, if it will be rejected later.
but I think that there is no better way than putting the burden to the one who wants to change the status quo.

"See the issue? It's very easy to say "Well why don't you do more?" but when people see those that try being met with negativity in such a big way it REALLY doesn't make me want to even try to help."
some people give up without trying, but I don't see any other way we could do this.
we can't promise that we will accept your idea before seeing what you propose and we also can't implement something for you if we aren't convinced that it is a good idea.

we can argue about whether the proposer or the voters know better what the users want, but in the end the responsibility for the change and for maintaining the code will be owned by the core devs, so you can't shut them out from the vote.

Nick Rawe

2013-01-27

@Phil: I think you've got a good handle on the current attitude- there is a widening divide. PHP has never been a general use language and as such really hasn't been designed to handle some of the things we now get it to do- I think back to the first website I ever built with PHP and it was something along the lines of a single HTML page with some Date/Time information embedded into it; that is how some people still see it or think that that is the only way it should be. Nowadays, I'm writing console apps with it!

I think we're in a unique position here as the users of a language. You look at some others and they start general purpose and add support for HTTP or User Interface bindings whereas PHP started as a focused language with a focused approach and has gradually tried to go the other way and be *more* general purpose. Because the barrier to basic understanding and installation is so low (who can remember buying a shared hosting package that _didn't_ have PHP on it?) lots of people start to pick it up and then we get the progressive users who want the nicer syntax like accessors or (blurgh) annotations. At this point, as a core dev, who do you invest your limited time in supporting?

Being a bit of a nerd for this kind of stuff I look through Github on a regular basis just to see what PHP projects are new or gaining momentum and anecdotally speaking I've noticed a trend towards this more advanced usage of the language (in some cases pushing it way past its original parameters...) and there seems to be somewhat of an expectation by userland developers that PHP should move to meet this which would in some cases make people reconsider their language of choice (if there are major BC breaks etc) or potentially raise the barrier to entry- both things which could take away PHP's current position of near de facto web scripting language.

As such it is at this point really that one has to look at PHP for what it is and ask themselves a question- can I live with the drawbacks or do I use another language? Not as a trolling thing, but coming from the latter position I don't see another language I'd personally want to use: Python is nice but I'm not a massive fan of its OOP model and although it is installed as part of the Single Linux Spec there's extra set-up required; same basic issue with Ruby, coupled with some syntax things; C#/ASPX MVC means installing Windows on a web server which is a little bit like heresy to my mind ;)

As such the only option I personally have is to accept and work around some of PHP's "limitations" (or missing "niceties" depending on how you flip the coin) and if new features make the cut then it's a bonus. One can always pop their heads above the parapet and propose an idea if they have time, but perhaps there could be an amendment to the RFC system by proposing an idea first and then implementing if it gets core dev approval; then you aren't wasting your time.

Just my two pennies!

Nick

Joe Watkins

2013-01-27

For clarity, this was not meant as offensive; as I read it back, it reads as something other than what was meant:

You likely have no idea what it is like to contribute or work on or with code you do not understand, that's the difference.

What I meant was, when you contribute PHP to a project you can understand all of it, but when anyone other than literally a handful of people contribute to PHP itself, they do not, and cannot understand all of it, in this setting it is very hard to be useful, it takes a lot of time and effort ... I wasn't saying that contributing to PHP is easier, or questioning your abilities at all, I hope that is understood.

I imagined, like you, that there are a bunch of guys doing everything and that everyone knows about all the crucial parts. This is just not the case: it's very hard to know enough about anything at all to actually be able to integrate your knowledge into an existing, mature system, written in any language.

Sorry, I didn't mean to come across offensive, my words were in the wrong order is all ...

This is to you, personally, feel free to put it on your blog, but I wanted clarity for you, personally.

Tomasz Sawicki

2013-01-28

I've been reading PHP internals ML only for a month now, but I've already come to the conclusion that I'd better not do that. One could wonder how PHP got that far, but on the other hand, it's more clear why PHP is how it is. I love PHP because it works everywhere and its scripting nature greatly speeds up development.
Anyway, here are my thoughts on things that could be improved to make PHP internals development more productive and discussions less stressful.

1. Vision
PHP needs The (One Unambiguous Documented) Vision, be it a simple programming language, OO all the way, or something in between. It's that or a vote-driven non-vision leading to unproductive debates and PHP being more of a Frankenstein's monster.

2. Voting
A major RFC proposal MUST contain a reference to how it's consistent with The Vision.

A "No" vote MUST be commented with a real reason. A reason is something that the proposer and others can discuss; it's difficult to discuss with just "No", and explaining the vote on internals ML is currently not mandatory.

If there's a patch available, its implementation details MUST NOT be taken into account when voting. Voting and discussion on implementation should be done somewhere else (Pull Requests?). That way, good ideas won't get rejected because the patch author was a poor PHP internals coder.

Voting members list should be periodically reviewed to remove those who do not actively contribute in any way to PHP anymore. Maybe it's done already, but I'm not aware of this.

3. Core contributors
PHP internals documentation should get as much attention as PHP language documentation, to make it easier for new people to even consider learning it.

Zeev

2013-01-28

Phil,

The PHP project has a strong bias for simplicity and a relatively strong bias for retaining compatibility. It was that way from the early years, ever since it took off.

I think there's a big misconception about why ideas get rejected. While some may vote nay because they're worried about the implementation, my position is that the implementation is very much secondary - what matters is impact on language complexity. That's why there's a 2/3 majority needed for new language features / changes.

I voted against the property accessors RFC - not because of the complexity of the patch - but because I think it introduces syntax that very few people care about, but that would complicate lives for everyone who will now need to understand it. I know Matthew and Ralph from my team voted in favor of it - and they would have loved to use it in Zend Framework - but I prefer that the end users that end up having to debug their code when it breaks, won't have to learn yet another feature with dedicated syntax and behaviors.

People complain that PHP has no vision. Can you point me to languages with 'vision'? I suspect that the pace in which language changes make it into PHP is actually quite high compared to other languages. Changes to C are decades apart, and I don't think we're slower than Java. FWIW, I think most innovation should move to the frameworks and extensions layer, and we should leave the language syntax alone.

Now, most of your post was about Unicode, which was an entirely different case.

First, the Unicode effort was spearheaded by Andrei Zmievski - a *very* capable developer. Clearly, his abilities were never the issue. The issues were, as Lukas pointed out, the fact that the decisions we took collectively produced a product that was twice as slow and consumed twice as much memory (4x lower density), while breaking compatibility for roughly every app in existence that's more complicated than Hello World. We might have a high bar, but we thought that wasn't an adequate successor to PHP 5.

If someone comes up with a better idea that's feasible, I'm sure it'll gain traction. We'd probably also be willing to put up with some level of compatibility breakage, and a mild level of slowdown & increased memory consumption. We'd love to have a Unicode-enabled PHP. We're less enthusiastic about turning PHP into the everything and the kitchen sink of languages. People who are extremely unhappy with this position have a wide selection of other languages they can choose from, and it's great that we all have that choice.

Simon

2013-01-28

Current state first example: http://3v4l.org/1aPNg second example: http://3v4l.org/C86Ls

Wayne

2013-01-29

@Zeev:

How did the magic methods get included but property accessors can't? See, we already have *this* feature in a way that's ugly and hard to debug but that end users still want to use! Property accessors really take this very same idea but make it organized and easy to understand. Having more syntax isn't necessarily more difficult; in fact having this spelled out as syntax would make it much easier than the current situation. It seems you believe the default stance of "do nothing / change nothing" is actually better for end users by default but I think you need more argument than that.
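
To make the comparison concrete, here is a minimal sketch of the magic-method approach. `__get` and `__set` are real PHP; the `Temperature` class, its `fahrenheit` property and the `writes` counter are invented purely for illustration:

```php
<?php
// Illustration only: class and property names are made up.
class Temperature
{
    private $celsius = 0.0;
    public $writes = 0; // counts writes, to show a side effect

    // Called when an inaccessible or undefined property is read.
    public function __get($name)
    {
        if ($name === 'fahrenheit') {
            return $this->celsius * 9 / 5 + 32;
        }
        throw new LogicException("No such property: $name");
    }

    // Called when an inaccessible or undefined property is written.
    public function __set($name, $value)
    {
        if ($name === 'fahrenheit') {
            $this->celsius = ($value - 32) * 5 / 9;
            $this->writes++;
            return;
        }
        throw new LogicException("No such property: $name");
    }
}

$t = new Temperature();
$t->fahrenheit = 212;       // looks like a plain property write
echo $t->fahrenheit, "\n";  // 212
echo $t->writes, "\n";      // 1
```

Every virtual property ends up funnelled through the same two methods and matched by name, which is the part that gets hard to follow and debug as a class grows.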

PHP6 was an **impossible** project -- I'm actually surprised to see that you still think it's possible to add unicode to the entire language and all libraries at once. No matter how good the coders are on that project, that's something that is just doomed to fail. Any attempt at unicode has to be done in a way that it's part of the regular PHP release cycle -- nobody is going to wait 2 years for the project to be done.

Tyler

2013-01-29

Hear, hear! Broken window theory comes to mind... Let us get on this. I'm horribly underexperienced to contribute to the source, but I'll gladly test.

What if we made a github repo to document proposed changes as a collective and then presented it to The Powers That Be?

Nick Patch

2013-01-30

@Phil: UTF-16 is a very sane option to implement Unicode strings in a programming language and is used for Java, JavaScript, and all .NET languages, as well as systems like Windows and OS X, and also the ICU library. I'd counter your claim that "It's slower, more CPU and memory intensive", because it's much less complex than UTF-8, especially when most of your characters are in the basic multilingual plane (BMP), which includes all the 16-bit UTF-16 characters (up through code point U+FFFF) and doesn't require surrogate pairs. It definitely takes more memory than UTF-8 when representing the subset of ASCII characters (double the memory), but the overall processing requirements are much lighter, and therefore less CPU intensive and faster. In most cases (and this depends somewhat on the characters in your string), UTF-8 is the most CPU intensive and least memory intensive while UTF-32 is the least CPU intensive and most memory intensive. UTF-16 is a good compromise.

Here's an article on the topic:
Unicode Technical Note #12: UTF-16 for Processing
http://www.unicode.org/notes/tn12/
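
The memory side of that trade-off is easy to check. The following is only a sketch, assuming PHP 7+ (for the `\u{...}` escapes) and the mbstring extension; the `encodedSizes()` helper is a made-up name:

```php
<?php
// Hypothetical helper: byte size of one UTF-8 string in three encodings.
// Assumes the mbstring extension is available.
function encodedSizes(string $utf8): array
{
    return [
        'UTF-8'  => strlen($utf8),
        'UTF-16' => strlen(mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8')),
        'UTF-32' => strlen(mb_convert_encoding($utf8, 'UTF-32BE', 'UTF-8')),
    ];
}

$samples = [
    'a (U+0061, ASCII)'          => "a",
    'euro sign (U+20AC, in BMP)' => "\u{20AC}",
    'emoji (U+1F600, above BMP)' => "\u{1F600}",
];

foreach ($samples as $label => $utf8) {
    $sizes = encodedSizes($utf8);
    printf("%-28s UTF-8: %d  UTF-16: %d  UTF-32: %d bytes\n",
        $label, $sizes['UTF-8'], $sizes['UTF-16'], $sizes['UTF-32']);
}
// a:         1 / 2 / 4 bytes
// euro sign: 3 / 2 / 4 bytes
// emoji:     4 / 4 / 4 bytes (a surrogate pair in UTF-16)
```

ASCII doubles in size under UTF-16, other BMP characters are often smaller than in UTF-8, and anything above U+FFFF needs a surrogate pair. This only measures storage; it says nothing about processing speed, which depends on the implementation.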

In defense of UTF-8, although converting between UTF-8 and UTF-16 is fast, not converting at all is fastest and almost all web developers (and most developers in general) desire UTF-8 input and output these days. Many programming languages use UTF-8 to internally implement Unicode strings, including Perl and Python, with great results. Perl is the language considered to have the most feature-rich Unicode support.

Ultimately, I don't have a strong opinion on whether a language implements Unicode strings internally as UTF-8, UTF-16, or something else entirely optimized for the language, as long as the language has core Unicode strings with functions, methods, and/or operators providing Unicode semantics. As an internationalization engineer, this is my top requirement in a programming language.

Jorge Pereira

2013-02-04

For those thinking PHP should just stand still and remain as is, that's the path to becoming a niche or legacy language. I still remember when most projects on SF.net were Perl. But that's the path for languages that refuse to evolve.

I understand you have code to maintain, but the entire community cannot hinge on that. So just stick to PHP 5.2 and be happy and comfortable, but PLEASE let the rest of us move on. If development grinds to a halt, you'll basically end up with a language which is dead in the water. It seems obvious the language has to evolve. Those who wish otherwise are dragging everyone else back for their own selfish reasons.

But worse yet, many of these syntax changes, they're OPTIONAL. There is no reason why they'd have to break code compatibility. They merely represent better patterns for programming discipline and promote code readability, as well as paving the way for enhanced parallelism. That's generally not just an opinion, it's been the matter of over a decade of evolution, which has seen its way to other languages.

So, it's not about specific features anymore. It's about making a choice, and setting a VISION for the future of the language. And unless someone has that vision, PHP will eventually die out and be replaced by something else.

Jorge Pereira

2013-02-04

The property accessor discussion is a great example. People refer to it as a "syntax change" or "something that people would need to learn".

The importance of this feature is that it enables you to keep using properties, but have side-effects on read/write operations. By NOT having this feature, we're forcing people to create wrapper set/get functions, which is just not very smart. Regardless, the thing is that there IS a need for this, so by voting against it rather than amending it, we're delaying the improvement.

Having VISION is different from a mission statement. Having a vision is about making a plan for how you will be successful in the future, reacting to changes in the world, trends and mindset.

And please, don't compare to Java, which had an aged syntax by the time it came out (95). It could have broken ground by adopting innovative characteristics like those sported by Delphi a few years earlier (93, led by Anders Hejlsberg, who decades later went on to work on C#).

PHP has the community, the relevance and the opportunity to avoid becoming irrelevant, and that can only happen by allowing itself to evolve, rather than try and remain loyal to a syntax born in the eighties.

Neeke Gao

2013-02-04

pissing in the wind.
More and more looks like JAVA.

Anthony Steiner

2013-02-26

Holy crap! Who would have known that this topic would be such a religious debate. :| <- sarcasm

Agreeing with @Jorge Pereira;
PHP is evolving, it's that simple. Extending the analogy, evolution sometimes takes a good long while, and even then it's not always immediately accepted once it's reached its apex. That is just the nature of things. Though I've always pegged developers as pragmatic thinkers who can see the benefits of the process and adapt.

@Everyone else:
Lazy UTF duck-typing is a feature not a language requirement and memory consumption on PHP is negligible even in an enterprise architecture so why argue over a fraction of a point?

Also accessors in a language that supports the OOP paradigm... that happens to be STRONGLY-TYPED is expected (Java, C#, Python... sorta, etc). One of the best parts about PHP is that it gives you the option to be implicit or explicit.

Not really sure folks understand that adding one feature based from another language can be a nightmare.
Add Enumerations, Generic Collections, scalar metrics... now this is just transposing other languages into PHP.
A strongly-typed PHP scares me, but a PHP that supports strongly-typed architecture makes me happy. Funny how that works.

Last thoughts:
Let it evolve, but let it do it at its own pace; let's not force other languages down the process's throat.



Asdfghjkl

2013-03-11

Sir, It is a great Post. I would Like to share your Link as a trackback on my Blog. Can I . It would be great if you allow me to do so.

A Concerned Bystander

2013-03-24

At the end of the day, this article is just fucking right.

The language needs to be forked; not only the source, the intent.
The language needs a vision; I never said mine, just a vision.
The intent needs to follow the vision, and the source needs to follow the intent.

There will always be people who just want to include or iterate something in an otherwise HTML file. That's just dandy, let them. We have 5.X.

To the rest of us, PHP is old, aging, yet immature; like some bizarre man-child who draws webpages. The only reason (so far as the numbers can tell me) that PHP is holding strong in the market, is because one-click-install disasters like Wordpress are at the disposal of every dog-walker and face-painter out there. I've lost confidence in PHP. I grew up on PHP, and (foolishly) learned too much from it. I've had to relearn. It wasn't fun.

Let them have Wordpress. Let them have 5.X. That's fine, that's just fine.
I'm walking away because PHP is immature. I want PHP++ (or PHP#, or PHP%, whatever)
