PHP 8.0 alpha 3 was released on 23 July, and netcurl has been tested together with this release. It turns out that a few changes have been made in the core of curl that required a minor patch in netcurl for it to work properly again. In short, it is about how curl presents itself in PHP 8.0, which is no longer as a resource. Instead, curl_init and friends return objects: CurlHandle, CurlMultiHandle, and so on.
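A quick way to see the difference: code that checks handles with is_resource() stops working on PHP 8.0. A minimal compatibility check could look like this (plain PHP with ext-curl, not netcurl's actual patch):

```php
<?php
// In PHP <= 7.4, curl_init() returns a resource; in PHP >= 8.0 it
// returns a \CurlHandle object, so is_resource() no longer works.
function isCurlHandle($handle): bool
{
    return (PHP_VERSION_ID >= 80000)
        ? $handle instanceof CurlHandle
        : is_resource($handle);
}

$ch = curl_init();
var_dump(isCurlHandle($ch)); // bool(true) on both PHP 7 and PHP 8
```

Note that instanceof against an undefined class name simply returns false, so the check is safe on older PHP versions as well.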
It has been planned for a long time now, and the sprint was set to finish next year. But the project turned out to be smaller than I thought, so the database connector was released yesterday. The purpose is, again, not to reinvent the wheel. It autodetects the best available driver on the platform it is installed on. Instead of forcing you to adapt, this driver behaves like netcurl: it adapts to the platform. However, this version is limited to PHP 5.6 and above.
The packagist codebase has also, like many of the other packages used by composer via packagist, been moved to GitHub to stay available during the service disruptions that seem to be quite common in the Bitbucket cloud. The codebase is, however, still self-hosted at bitbucket.tornevall.net – so the GitHub collection is just a mirror.
In any case, this also takes us a bit closer to the complete library/API we’re targeting.
8 days are left in the planned release sprint (and the date was just randomly set at first).
netcurl 6.1.0 was just released anyway! The package has been up and tested for a few weeks already and should be entirely backward-compatible with the 6.0 branch.
However, to make sure compatibility breaks are avoided (if you still use old classes and calls anywhere), you should consider including tornevall/tornelib-php-netcurl-deprecate-60 in your installation packages, as the choice has been made to not include the old naming standards at all in the new release.
The entire structure, actually. All drivers are separated into their own PSR-4 based modules and can be called separately. To use the old way of auto-discovering available drivers, note that MODULE_CURL has been replaced with NetWrapper, so that’s what you’re looking for in that case. If you are totally sure that you want to use specific drivers, you can choose to do so.
Another new thing is how you configure the drivers. There is a bunch of defaults that can be used by setting up a new WrapperConfig; this will be covered in the documentation soon. The curl driver has also been upgraded to support multiple calls in the same request (curl_multi), which has not been possible before. And, as mentioned below, the SOAP support has been upgraded: amongst many features, WSDL requests can now be done with or without cache (and keep the ability to detect authentication failures in both states).
With the above SOAP support, it was also decided to let WrapperConfig set production/staging mode: if you initialize wrappers in staging mode, cached requests will be the default state for production requests. This will speed things up a bit.
Netcurl has also been divided into two separate packages, where the old network module is its own package as of version 6.1.0. To make sure both packages are included in your projects, it is sometimes enough (and avoids redundant inclusions with composer) to just load tornelib-php-network, as netcurl is included as a requirement in that package.
As of 6.1.0, most of the old drivers are removed, except for curl, streams and SOAP. Since stream drivers are included by default (with the binary-ready file_get_contents), nothing else is actually required. However, drivers can be registered manually via the register functions if they are really necessary. This release also includes an extra RSS parser that uses SimpleXML as long as the Laminas RSS parser isn’t included in the package.
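The SimpleXML fallback mentioned above is essentially this pattern (a generic sketch of RSS parsing with SimpleXML, not netcurl’s actual parser):

```php
<?php
// Minimal RSS parsing with SimpleXML, the kind of fallback used when a
// dedicated RSS parser (such as Laminas) is not installed. Sketch only.
function parseRssTitles(string $rss): array
{
    $xml = simplexml_load_string($rss);
    if ($xml === false || !isset($xml->channel->item)) {
        return [];
    }
    $titles = [];
    foreach ($xml->channel->item as $item) {
        $titles[] = (string)$item->title;
    }
    return $titles;
}

$sample = '<rss version="2.0"><channel><title>Feed</title>'
    . '<item><title>First post</title></item>'
    . '<item><title>Second post</title></item>'
    . '</channel></rss>';
print_r(parseRssTitles($sample)); // prints the two item titles
```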
Any questions or support-related requests can be sent via the e-mail list or via firstname.lastname@example.org
For full release notes, you can take a look at the list below.
* [NETCURL-225] - NetCURL 6.1.0 EPIC
* [NETCURL-226] - PSR4 NetCURL+Network (Phase 1)
* [NETCURL-230] - WordPress driver in prior netcurl is lacking authentication mechanisms
* [NETCURL-246] - Pipeline errors for PHP 7.3-7.4
* [NETCURL-272] - getSoapEmbeddedRequest() - PHP 5.6+PHP 7.0
* [NETCURL-274] - Cached wsdl requests and unauthorized exceptions
* [NETCURL-200] - Confirm (by driver) that a driver is really available (update interface with requirements)
* [NETCURL-209] - Support stream, when curl is not an option
* [NETCURL-227] - Migrate: getHttpHost()
* [NETCURL-231] - Move out MODULE_NETWORK to own repo
* [NETCURL-232] - Make exceptions global
* [NETCURL-234] - Support immediate inclusions of network libraries
* [NETCURL-238] - Make setTimeout support ms (curlopt_timeout_ms)
* [NETCURL-242] - Disengage from constructor usage
* [NETCURL-243] - Reimport SSL helper
* [NETCURL-244] - Reimport curl module
* [NETCURL-245] - Reimport soapclient
* [NETCURL-247] - The way we set user agent in the SSL module must be able to set in parents
* [NETCURL-249] - setAuth for curl
* [NETCURL-251] - setAuth for soap
* [NETCURL-255] - Add static list of browsers for user-agent
* [NETCURL-258] - NetWrapper MultiRequest
* [NETCURL-259] - High focus on curl (rebuild from 6.0)
* [NETCURL-260] - Current curl implementation is only using GET and no advantage of Config
* [NETCURL-261] - Make sure setAuthentication is a required standard in the wrapper interface
* [NETCURL-263] - Add errorhandler for multicurl
* [NETCURL-264] - Avoid static constants inside core functions
* [NETCURL-265] - getCurlException($curlHandle, $httpCode) - httpCode is unused. Throw on >400
* [NETCURL-267] - On http head errors (>400) and non empty bodies
* [NETCURL-268] - Add Timeouts
* [NETCURL-269] - Proxy support for stream_context
* [NETCURL-271] - Synchronize with netcurl 6.0 test suites
* [NETCURL-273] - Support driverless environment
* [NETCURL-276] - Use a natural soapcall with call_user_func_array
* [NETCURL-277] - Netwrapper Compatibility Service
* [NETCURL-278] - setChain in 6.1 should throw errors when requested true
* [NETCURL-279] - Make sure setOption is useful in NetWrapper and MODULE_CURL or work has been useless
* [NETCURL-280] - proxy support for curlwrapper and wrappers that is not stream wrappers
* [NETCURL-283] - Use setSignature (?) to make requesting clients set internal clientname/version as userAgent automatically instead of Mozilla
* [NETCURL-285] - Reinstate Environment but in ConfigWrapper to make wsdl transfers go non-cache vs cache, etc
* [NETCURL-286] - Move driver handler into own class
* [NETCURL-287] - Initialize simplified streamSupport
* [NETCURL-288] - Output support for XML in simpler wrappers
* [NETCURL-289] - Open for third party identification rather than standard browser agent
* [NETCURL-291] - SoapClient must be reinitialized each time it is called
* [NETCURL-292] - Support basic rss+xml via GenericParser
* [NETCURL-293] - Try fix proper rss parsing without garbage
* [NETCURL-294] - Make it possible to initialize an empty curlwrapper (without url)
* [NETCURL-295] - MultiNetwrapper (+Soap)
When Facebook groups lose their administrators/moderators, the group will suggest promoting new ones. Either you can choose yourself, or someone you think could be “the perfect administrator”.
This is normally shown as a notice in the right column of the screen (unless you’re using the new design). That said, promoting a new administrator can only be done if the group is missing its entire “team”. Tornevall Networks built a plugin that automates this promotion when a group is opened in Chrome.
It can also, partially, do this via Facebook group bookmarks. That part of the plugin isn’t perfect, as it could crash if your group list is too large – not because of the large list itself, since it opens one group at a time. This is still being debugged. The plugin works best on often-visited groups, though. You can find it via the link below.
Currently, work is in progress to support the new Facebook design; we will come back to that later. By the way, the plugin is multilingual: groups can be promoted regardless of which language you’ve chosen, and in a few different ways. And since you might not want to promote yourself as an administrator in racist groups, there’s a switch that lets you approve before a promotion happens.
Many of the netcurl 6.1 components have been released this week – dependencies that “we” use to make netcurl more efficient. Many things have been imported from 6.0, but instead of copying local code from the old project (with pride), all code has been rewritten from scratch. Having old code in a new project could probably be devastating.
So. The wheel has been built again?
No. Yes. Maybe. Nah. We’re just making the wheel better!
This “wheel” is a self-contained project. As usual. I’ve said it before. The project is being built to be a simplified layer between an idiot developer and the internet communication layers. For example, when building new solutions a developer normally needs to reinitialize the tools they are using. If the developer is using curl, it has to be configured from scratch. And there are probably other solutions that do similar things to this module. But I don’t want them.
The primary reason for this is the fact that I need more. I need a parser that handles all communication by itself, without my interference. I say: “Bring me this website, parsed” and the solution should bring it to me, regardless of whether it is SOAP, REST, XML, RSS or a socket connection. Netcurl is the part that should decide this for me, and I should be ready on the other side to receive it. This is actually how curl itself works.
Compared to the military, this application is not supposed to ask “HOW HIGH?!” when someone says “Jump!”. It should know how high to jump before developers have even measured that value.
However, if the site speaks SOAP, you need an extra component – and SoapClient probably resides in such solutions. Netcurl should handle this out of the box. Or XML. Or RSS. So here we are again.
netcurl 6.1 is being rebuilt, and you’d probably understand why if you looked at the old codebase. It was supposed to transform into a PSR-4 solution, but it failed. So there are a lot of reasons why this wheel is being rebuilt. And this time, I think I got it faster. For example, netcurl 6.0 (or TorneLIB 6.0) did not support anything but regular curl communications: if you wanted to fetch five different pages, you had to run the curl request five times. netcurl 6.1 takes advantage of curl_multi calls, so you can push five different URLs into the same session and get them back from different handlers. Like it should be.
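The multi-request behaviour described above wraps PHP’s own curl_multi API. As a rough sketch of the underlying pattern (plain PHP with ext-curl; the function name is made up, and this is not netcurl’s actual code):

```php
<?php
// Sketch of the curl_multi pattern: several URLs pushed into one
// session and fetched concurrently, bodies returned keyed by URL.
function multiGet(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }
    // Run all transfers until nothing is active anymore.
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh);
        }
    } while ($active && $status === CURLM_OK);

    $result = [];
    foreach ($handles as $url => $ch) {
        $result[$url] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $result;
}
```

Calling `multiGet(['https://example.com/a', 'https://example.com/b'])` would fetch both pages in the same session instead of two sequential curl runs.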
So what does back to basics mean?
Well. netcurl, or TorneLIB as it was previously named, was actually built with war in mind. It was supposed to scrape proxy lists, collect the proxies and then register them for use in DNSBL. In short: blacklisting trolling haters on the internet. Then work came my way, and I found that netcurl had a perfect role in an e-commerce project we built (yes, we). As it supported both REST and SOAP in an environment where the library itself chose an available driver, it could handle much more than just warfare applications.
Time passed, and a few weeks ago corona said hello. At that moment most of us stayed home, so when the “real work” ended I could take care of netcurl without interruptions. And here we are, almost done. This time, the library should be able to handle it better than it used to.
This old project, once born as a proxy scraping tool, is alive again. Well, in fact it has been alive but idle for several years, as its purpose took a big turn when I started enforcing its implementation in an e-commerce platform. The project turned out to be a great combo-mixer of communication tools, since it had great failover possibilities. However, times change, and more than this is needed now.
My wish is to reinstate a proxy scraper. This project was written in the early years of PHP 5.3, and that says a lot about what it was and what it can now become instead. As you can see on the left side, the support for failovers is growing. This of course takes time to implement but, as I just wrote, I believe it must be done.
But that is not everything. By making the client more compliant with reality, it could also be part of other projects – like the network tools, since those tools won’t be much use if there’s no data scraping available. For example, there was the “fnarg project” (better known today, to a smaller amount of people, as part of the giraffe project), which specialized in RSS fetching. The fetcher was built so that it not only fetched new articles; it also kept track of old ones, and whether they were changed/edited over time.
All of this has forced me into a state that I’ve refused to be in for several years now. But realizing that PHP moves forwards, and not much backwards, this must be done before it’s too late.
The plugin rolls through a bunch of categories and (currently) site names. When triggered, the content only gets flagged. In this case (as long as it works), a post will say something like “hey, this shared URL is based on THIS”. So if we trigger on a satire site, we basically say “This is satire. Beware.”
I was happy with this behaviour until I realized that different categories have different trigger levels. I’d say “fascism by choice” is not the correct term for this side patch; it’s all about customization.
For example, if I normally want content flagged with a notification, this will also happen for data that comes from the category “rightWing”. That’s not good enough! So from now on, the JSON object will be handled multidimensionally, where the above JSON block contains the default behaviour for a specific site. If I for some reason need to change this behaviour, I can do it either at site level or at category level. Let me show this too, below.
One thing to note is that the same rules for the actions also apply to description/descriptions.
"description": "Right wing politics.",
"nyheteridag.se": "Nyheter Idag",
"friatider.se": "Fria Tider",
"description": "Fake news and satire",
"storkensnyheter": "Storkens Nyheter (obsolete)"
"storkensnyheter": "Content on this site was considered fake news and made people angry."
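The lines above appear to be fragments of a nested JSON block. A hedged reconstruction – the surrounding structure, the category names and the placement of description/descriptions are assumptions based on the surrounding text, not the plugin’s actual format – could look like:

```json
{
  "rightWing": {
    "description": "Right wing politics.",
    "action": "replace",
    "sites": {
      "nyheteridag.se": "Nyheter Idag",
      "friatider.se": "Fria Tider"
    }
  },
  "fakeNews": {
    "description": "Fake news and satire",
    "sites": {
      "storkensnyheter": "Storkens Nyheter (obsolete)"
    },
    "descriptions": {
      "storkensnyheter": "Content on this site was considered fake news and made people angry."
    }
  }
}
```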
The above example has a default action set to replace. While the site itself (Facebook) has a setting that tells the plugin to notify the user on normal triggers, this default action is attached to the rightWing category. In this case, if we trigger on storkensnyheter, the plugin will keep notifying me about “fake news”. But if friatider triggers, the detected element will be replaced completely with a notification box saying that the content was there before but no longer is.
However, there are more special rules under the actions object: I can live with shared content from nyheteridag, so if we happen to trigger on that site, the plugin will fall back to a notification. If for some reason the samnytt link triggers, that element will not show up at all, not even with a notification.
See below for the plugin in effect! Note: the screendumps below do not match the configuration above.
The current release has gone through very basic testing, with Facebook as its base. However, it is time to move forward. The next step in the codebase is to make a configurable interface, categorized in a user-friendly setup, so we can move further beyond the “one platform only” world.
Basically, this is a completely API-less release, so the first setup will be built on JSON objects, which will be shareable. The first experimental JSON block can look like the one below and will also be the output of a future API request. The content will be described in detail on the doc pages (link above).
The hardest thing currently known to me is keeping up the motivation in a universe where time is never enough. However, the project is actually moving forward. The first outcome of the non-adopted codebase (no, I did not adopt old code this time) can be seen below.
The words used in this version are censored due to “word trigger sensitivity”. That is, they are probably a trigger for some people. Probably some right wingers.
There’s no API ready for sharing and saving blocking data yet. But I need to figure out a few more things first anyway. One is how configurable the extension should be. Since this plugin is planned to be site-independent, the Facebook example above is only the first step. Besides, I have some idea of making simple JSON imports, so it could be completely API-less too. Or some kind of “I’ll post my JSON data here in this forum, feel free to use my filtering rules”. That could probably give a feeling of decentralization. In other words, there should be no API that could be shut down or DDoSed by angry users.
DOMSubtreeModified is deprecated, so the extension primarily runs with a MutationObserver. There is however a failover setting in the configuration that allows DOMSubtreeModified to be used instead. That was the prior method of making sure elements are always analyzed, even after window.load. There are always AJAX requests that should probably be included in scans, as long as they make visual changes in the browser.
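The switch between the two mechanisms can be sketched like this (illustrative only; the function and callback names are made up, not the extension’s real code):

```javascript
// Sketch: prefer MutationObserver, fall back to the deprecated
// DOMSubtreeModified event when the failover setting is enabled.
// scanNode() is a stand-in for the extension's real element scanner.
function observeDom(root, scanNode, useLegacyFallback) {
    if (typeof MutationObserver !== 'undefined' && !useLegacyFallback) {
        const observer = new MutationObserver((mutations) => {
            for (const m of mutations) {
                m.addedNodes.forEach((node) => scanNode(node));
            }
        });
        observer.observe(root, { childList: true, subtree: true });
        return 'observer';
    }
    // Deprecated path, kept only as a failover setting.
    root.addEventListener('DOMSubtreeModified', (e) => scanNode(e.target));
    return 'legacy';
}
```

The return value simply reports which mechanism was picked, which makes the fallback easy to verify in tests.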
Making it happen
Currently, this script loops through a pre-defined wordlist. For each element found on the site, the plugin checks whether there are any sub-elements within the scanned primaries – which come from either DOMSubtreeModified or a MutationObserver – that contain URL elements. Any URL elements found are scanned for the bad words listed in the sample variable.
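The URL scan itself is essentially a substring match against the wordlist; a trivial sketch (the wordlist here is illustrative, not the plugin’s actual sample variable):

```javascript
// Sketch: check whether any listed bad word occurs in a URL.
function urlTriggers(url, wordlist) {
    const haystack = url.toLowerCase();
    return wordlist.some((word) => haystack.includes(word.toLowerCase()));
}

console.log(urlTriggers('https://example.com/badsite/article', ['badsite'])); // true
console.log(urlTriggers('https://example.com/ok', ['badsite']));              // false
```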
The next step in this script would probably be to make the scanning level configurable too. For example, the current version depends on there being – after a found URL – a parent element with the class userContentWrapper assigned. When we trigger on this, we choose to replace the element with a text instead of removing it. This part should however be configurable by users, probably with something like this:
Keep scanning elements on every site this plugin is active on.
Let user configure which element to look for, if it contains a .class or a #id.
When the .class or #id is found X levels up, decide what to do (replace or remove the child) and at what level it should happen.
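Resolving which action applies to a trigger – site level first, then category level, then the default – could work roughly like this (the key names follow the JSON ideas discussed earlier but are assumptions, not the extension’s actual schema):

```javascript
// Sketch: pick the most specific configured action for a triggered site.
// Site-level rules override category-level, which override the default.
function resolveAction(config, category, site) {
    const cat = config.categories && config.categories[category];
    if (cat && cat.actions && cat.actions[site]) {
        return cat.actions[site];      // site-level override
    }
    if (cat && cat.action) {
        return cat.action;             // category-level default
    }
    return config.defaultAction || 'notify';
}

const config = {
    defaultAction: 'notify',
    categories: {
        rightWing: {
            action: 'replace',
            actions: { 'nyheteridag.se': 'notify', 'samnytt.se': 'remove' }
        }
    }
};

console.log(resolveAction(config, 'rightWing', 'friatider.se')); // replace
console.log(resolveAction(config, 'rightWing', 'samnytt.se'));   // remove
```

This mirrors the behaviour described earlier: nyheteridag falls back to a notification, samnytt is removed entirely, and everything else in the category is replaced.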
The current examples and snippets
Each element on Facebook is considered a kind of “card” element. That is, the card is the user post container. Removing the whole card also removes everything linked to the post without leaving traces of borders, etc. From there, it can also be replaced with text or information.
Using userContentWrapper (Facebook), this is doable. The discovered “card node” should jump back to its parent and work from there (currently done with jQuery). Below is an example of such cards. Facebook initialization always starts with those, emptied.
We should however not stop there. I need to check if it’s possible to remove only the LINK element, so that the post data stays while the traces of the link are removed. Also, posts are currently removed even when only a comment contains “bad links”. This has to be limited. That is however a completely different chapter and should be configurable at a user-defined level. Why? To make users responsible for their own actions, probably.