Real-life environment issue requires action

Not long ago, I planned to start working on a new API. Version 3 is currently out and handles most DNSBL services plus a few others. However, as a huge Marvel fan, I’ve realized that not all the sites I monitor actually run a proper RSS feed.

FnargBlog once made an RSS-scraping tool that fetched a bunch of feeds and then matched the content, to monitor content changes. The scraping was moved to Tornevall Networks, but the source quickly became outdated.

Now, as we wait for big things to happen in the Marvel Cinematic Universe after the Disney investor meeting presented the plans for 2021, I realize that some of the sites I try to monitor lack RSS feeds. I can monitor a lot of RSS feeds, Google among them, but I still miss important Twitter feeds and such.

That said, I guess I have to create a new RSS scraper ASAP, with support for Twitter. This time, however, I need to create the output RSS feed myself as well. Hopefully this can soon be implemented in API 4.0. During the testing period (and since there is very little user authentication in place), the feed might work as is, for free.

To be continued…

Posted in Uncategorized | 1 Comment

What’s done in spare time (segment catching experiment)

This is a first live example of how to download file segments (for example video files) from a single playlist (like an m3u manifest) using only netcurl as the download library. It is also an example of what sometimes happens in spare time.

The project can be found at https://github.com/Tornevall/netcurl-segment-catcher or https://bitbucket.tornevall.net/users/tornevall/repos/mpd-netcurl/browse.

So what is this?

Basically, it is an example of what people can’t explain. Once upon a time, I got curious about how playlist manifests were built and how their segments could be downloaded and merged into one file. The most common way to do this, for example in a shell script, was to simply use curl from the command line:

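# Fetch the manifest, then append every listed segment to merge.file.
curl -s <url> >manifest.file
# Skip comment lines (#...) and blank lines; <extraUrlData> is a URL prefix placeholder.
grep -v '^#' manifest.file | awk 'NF {system("curl -sS <extraUrlData>" $1 " >>merge.file")}'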

However, I quickly realized that this wasn’t enough, since some playlists were delivered with multiple segments. Just downloading everything into the same joined file could simply destroy the content, or disrupt it. On this journey, I realized that I could actually use the tornelib-php-netcurl library to do the dirty work for me. So I wrote this project to see whether it worked or not. It had better, since it has been widely used in various ecommerce projects, where reachability is the primary key to success. However, this was about binary files, so the expectations on the netcurl project were quite high. If it can’t handle binaries in reality, I can just throw it in the garbage.
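To make the idea concrete, here is a minimal sketch in Python of the same steps the shell one-liner performs, written with the standard library only. It is not how the netcurl-segment-catcher project does it; the function names are made up for this illustration, and it assumes a flat m3u-style manifest with one segment URI per non-comment line. It does not try to solve the multi-segment problem described above; it only shows relative URIs being resolved and the output being written in binary mode.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def parse_segments(manifest_text, manifest_url):
    """Return absolute segment URLs listed in an m3u-style manifest."""
    segments = []
    for line in manifest_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments, not segment URIs.
        if not line or line.startswith("#"):
            continue
        # Relative entries are resolved against the manifest's own URL.
        segments.append(urljoin(manifest_url, line))
    return segments


def fetch_bytes(url):
    """Download one URL and return its body as raw bytes."""
    with urlopen(url) as response:
        return response.read()


def merge_segments(segment_urls, out_path, fetch=fetch_bytes):
    """Append every segment to out_path in binary mode, in manifest order."""
    with open(out_path, "wb") as out:
        for url in segment_urls:
            out.write(fetch(url))
```

Passing the downloader in as the fetch argument is only a convenience of the sketch: it lets the merge step be tried with a fake downloader before any real network traffic is involved.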

So here we are with the first successful live example of how to use a binary safe downloader to fetch multiple segments from a playlist. The linked project has a README-file that explains more.

PHP 8.0 is now delivered with apt repositories

Last time I checked, PHP 8.0 RC1 had to be manually compiled so tests could join the suite. But I just discovered that PHP 8.0 is now delivered with the “ondrej PPA”. This also means that PHP 8.0 no longer has to be compiled manually.

netcurl 6.1.1 is imminent and ready for PHP8

PHP 8.0 alpha3 was released on 23 July, and netcurl has been tested with this release. It turns out that a few changes have been made to the curl extension in PHP core, which required a minor patch in netcurl for it to work properly again. In short, it is about how curl handles are represented in PHP 8.0: they are no longer resources. curl_init now returns a CurlHandle object, curl_multi_init returns a CurlMultiHandle object, and so on.

You can read about the change at https://github.com/php/php-src/blob/php-8.0.0alpha3/UPGRADING while waiting for the upgrade.

The Bamboo test suite now also includes PHP 8 tests, starting with PHP 8.0alpha3.

database driver (tornelib-php-database 6.1.0) is now up to date

This has been planned for a long time, and the sprint was set to finish next year. But the project was smaller than I thought, so the database connector was released yesterday. The purpose is, again, not to reinvent the wheel. It autodetects the best available driver on the platform it is installed on. Instead of forcing you to adapt, this driver works like netcurl: it adapts to the platform. However, this version is limited to PHP 5.6 and above.

Like many of the other packages that composer fetches via packagist, the codebase published on packagist has also been moved to GitHub, so that it stays available during the service disruptions that seem to be quite common on Bitbucket Cloud. The codebase is, however, still self-hosted at bitbucket.tornevall.net, so the GitHub collection is just a mirror.

In any case, this also takes us a bit closer to the entire library/API we’re targeting.
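The autodetection idea can be sketched in a few lines. The example below is in Python and is only an analogy: the module names and their preference order are assumptions chosen for illustration, not the drivers or the logic that tornelib-php-database actually uses.

```python
import importlib.util

# Hypothetical preference order for this sketch: the first match wins.
PREFERRED_DRIVERS = ("psycopg2", "pymysql", "sqlite3")


def is_available(module_name):
    """Report whether a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def detect_driver(preferred=PREFERRED_DRIVERS, available=is_available):
    """Return the first driver name in `preferred` that is installed."""
    for name in preferred:
        if available(name):
            return name
    raise RuntimeError("no supported database driver is installed")
```

The point of the design is that calling code asks for "a driver" and lets the platform decide which one it gets, instead of hard-coding one and failing where it is missing.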

https://github.com/Tornevall/tornelib-php-database

netcurl 6.1.0 just released

There are 8 days left in the planned release sprint (and that date was just picked at random at first).

netcurl 6.1.0 has been released anyway! The package has been up and tested for a few weeks already and should be entirely backward-compatible with the 6.0 branch.

However, to make sure compatibility breaks are prevented (if you still use old classes and calls anywhere), you should consider including tornevall/tornelib-php-netcurl-deprecate-60 in your installation packages, since the choice was made not to include the old naming standards at all in the new release.

What’s new?

The entire structure, actually. All drivers are separated into their own PSR-4 based modules and can be called separately. For the old way of auto-discovering available drivers, MODULE_CURL has been replaced with NetWrapper, so that’s what you’re looking for in that case. If you are sure that you want to use a specific driver, you can choose to do so.

Another new thing is the way you configure the drivers. There are a bunch of defaults that can be used by setting up a new WrapperConfig. This will be covered in the documentation soon. The curl driver has also been upgraded to support multiple calls in the same request (curl_multi), which was not possible before. SOAP support has been upgraded as well (see the release notes below). Among many features, WSDL requests can now be made with or without cache, while keeping the ability to detect authentication failures in both states.

With the SOAP support above, it was also decided to let WrapperConfig set a production/staging mode when you initialize wrappers, where cached requests are the default state for production requests. This speeds things up a bit.

Netcurl has also been divided into two separate packages: as of version 6.1.0, the old network module is its own package. To make sure both packages are included in your project, it is sometimes enough (and avoids redundant inclusions with composer) to just load tornelib-php-network, as netcurl is included as a requirement in that package.

As of 6.1.0, most of the old drivers are removed, except for curl, streams and SOAP. As stream drivers are included by default (with the binary-safe file_get_contents), I realized that nothing else is actually required. However, drivers can be built manually and added through the register functions if they are really necessary. This release also includes an extra RSS parser, which uses SimpleXML as long as the Laminas RSS parser isn’t included in the package.

Any questions or support-related requests can be sent via the e-mail list or via support@tornevall.net.

Good luck!

For full release notes, you can take a look at the list below.

    * [NETCURL-225] - NetCURL 6.1.0 EPIC
    * [NETCURL-226] - PSR4 NetCURL+Network (Phase 1)
    * [NETCURL-230] - WordPress driver in prior netcurl is lacking authentication mechanisms
    * [NETCURL-246] - Pipeline errors for PHP 7.3-7.4
    * [NETCURL-272] - getSoapEmbeddedRequest() - PHP 5.6+PHP 7.0
    * [NETCURL-274] - Cached wsdl requests and unauthorized exceptions
    * [NETCURL-200] - Confirm (by driver) that a driver is really available (update interface with requirements)
    * [NETCURL-209] - Support stream, when curl is not an option
    * [NETCURL-227] - Migrate: getHttpHost()
    * [NETCURL-231] - Move out MODULE_NETWORK to own repo
    * [NETCURL-232] - Make exceptions global
    * [NETCURL-234] - Support immediate inclusions of network libraries
    * [NETCURL-238] - Make setTimeout support ms (curlopt_timeout_ms)
    * [NETCURL-242] - Disengage from constructor usage
    * [NETCURL-243] - Reimport SSL helper
    * [NETCURL-244] - Reimport curl module
    * [NETCURL-245] - Reimport soapclient
    * [NETCURL-247] - The way we set user agent in the SSL module must be able to set in parents
    * [NETCURL-249] - setAuth for curl
    * [NETCURL-251] - setAuth for soap
    * [NETCURL-255] - Add static list of browsers for user-agent 
    * [NETCURL-258] - NetWrapper MultiRequest
    * [NETCURL-259] - High focus on curl (rebuild from 6.0)
    * [NETCURL-260] - Current curl implementation is only using GET and no advantage of Config
    * [NETCURL-261] - Make sure setAuthentication is a required standard in the wrapper interface
    * [NETCURL-263] - Add errorhandler for multicurl
    * [NETCURL-264] - Avoid static constants inside core functions
    * [NETCURL-265] - getCurlException($curlHandle, $httpCode) - httpCode is unused. Throw on >400
    * [NETCURL-267] - On http head errors (>400) and non empty bodies
    * [NETCURL-268] - Add Timeouts
    * [NETCURL-269] - Proxy support for stream_context
    * [NETCURL-271] - Synchronize with netcurl 6.0 test suites
    * [NETCURL-273] - Support driverless environment
    * [NETCURL-276] - Use a natural soapcall with call_user_func_array
    * [NETCURL-277] - Netwrapper Compatibility Service
    * [NETCURL-278] - setChain in 6.1 should throw errors when requested true
    * [NETCURL-279] - Make sure setOption is useful in NetWrapper and MODULE_CURL or work has been useless
    * [NETCURL-280] - proxy support for curlwrapper and wrappers that is not stream wrappers
    * [NETCURL-283] - Use setSignature (?) to make requesting clients set internal clientname/version as userAgent automatically instead of Mozilla
    * [NETCURL-285] - Reinstate Environment but in ConfigWrapper to make wsdl transfers go non-cache vs cache, etc
    * [NETCURL-286] - Move driver handler into own class
    * [NETCURL-287] - Initialize simplified streamSupport
    * [NETCURL-288] - Output support for XML in simpler wrappers
    * [NETCURL-289] - Open for third party identification rather than standard browser agent
    * [NETCURL-291] - SoapClient must be reinitialized each time it is called
    * [NETCURL-292] - Support basic rss+xml via GenericParser
    * [NETCURL-293] - Try fix proper rss parsing without garbage
    * [NETCURL-294] - Make it possible to initialize an empty curlwrapper (without url)
    * [NETCURL-295] - MultiNetwrapper (+Soap)

A discussion group is missing an administrator – What can I do?

When a Facebook group loses its administrators/moderators, the group will suggest that a new administrator is promoted. You can either choose yourself or someone you think could be “the perfect administrator”.

This is normally done through a notice in the right column of the screen (unless you’re using the new design). That is, a new administrator can only be promoted if the entire group is missing a “team”. Tornevall Networks built a plugin that automates this promotion when a group is opened in Chrome.

It can also, partially, do this via Facebook group bookmarks. That part of the plugin isn’t perfect, as it could crash if your group list is too large. The crash is not caused by the size of the list as such, since the plugin opens one group at a time; this is still being debugged. The plugin works best on often-visited groups. You can find it at the link below.

Currently, support for the new Facebook design is a work in progress. We will come back to that later. And by the way, the plugin has multilingual support: promotions can be made regardless of which language you’ve chosen, and in a few different ways. As you might not want to promote yourself as an administrator in racist groups, there’s a switch that lets you confirm before each promotion.

https://chrome.google.com/webstore/detail/tornevalls-administrator/ojmdmooibphpmidehiiglejemhgbbfel

Back to basics with netcurl 6.1

Many of the netcurl 6.1 components have been released this week: dependencies that “we” use to make netcurl more efficient. Many things have been imported from 6.0, but instead of copying code from the old project (with pride), all code has been rewritten from scratch. Having old code in a new project could probably be devastating.

So. The wheel has been built again?

No. Yes. Maybe. Nah. We’re just making the wheel better!

This “wheel” is a self-contained project. As usual. I’ve said it before. The project is being built to be a simplified component between an idiot developer and the internet communication layers. For example, when building new solutions, a developer normally needs to reinitialize the tools he/she (hen) is using. If the developer is using curl, it has to be configured from scratch. And there are probably other solutions that do similar things to this module. But I don’t want them.

The primary reason for this is that I need more. I need a parser that handles all communication by itself. Without my interference. I say: “Bring me this website parsed” and the solution should bring it to me, regardless of whether it is SOAP, REST, XML, RSS or a socket connection. Netcurl is the part that should decide this for me, and I should be ready on the other side to receive it. And this is actually how curl itself works.
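One way to picture “bring me this parsed” is a dispatcher that looks at the response’s Content-Type and picks a parser on its own. The Python sketch below shows only that principle; it is not netcurl’s implementation, parse_body is a name invented here, and it covers just JSON and XML-based formats such as RSS.

```python
import json
import xml.etree.ElementTree as ET


def parse_body(body, content_type):
    """Parse a response body based on its Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        return json.loads(body)
    if media_type in ("application/xml", "text/xml") or media_type.endswith("+xml"):
        return ET.fromstring(body)
    # Unknown formats are handed back untouched.
    return body
```

The caller never names a format; it hands over the body and the header and gets back a dictionary, an XML element tree, or the raw body.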

Unlike in the military, this application is not supposed to ask “HOW HIGH?!” when someone says “Jump!”. It should know how high to jump before the developers have even measured that value.

However, if the site speaks SOAP, you need an extra component. And SoapClient is probably part of such solutions. Netcurl should handle this out of the box. Or XML. Or RSS. So here we are again.

netcurl 6.1 is being rebuilt, and you’d probably understand why if you looked at the old codebase. It was supposed to transform into a PSR-4 solution. But it failed. So there are a lot of reasons why this wheel is being rebuilt. And this time, I think I got it faster. For example, netcurl 6.0 (or TorneLIB 6.0) did not support anything but regular curl communication. If you wanted to fetch five different pages, you had to run the curl request five times. netcurl 6.1 takes advantage of curl_multi calls, so you can push five different URLs into the same session and get them back from different handles. Like it should be.
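The difference between five sequential requests and one batch can be illustrated in Python. Note that this sketch uses a thread pool, not curl_multi: it is a different mechanism that gives a similar “push several URLs, get each result back separately” experience. The names fetch and fetch_all are hypothetical and have nothing to do with netcurl’s API.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one URL and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch_one=fetch, max_workers=5):
    """Fetch all URLs concurrently and return {url: body or exception}."""
    def safe(url):
        # Keep one failing URL from discarding the other results.
        try:
            return fetch_one(url)
        except Exception as error:
            return error

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with urls.
        return dict(zip(urls, pool.map(safe, urls)))
```

Each URL gets its own entry in the result, so a failure on one page shows up as an exception object under that URL while the other pages are still returned.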

So what does back to basics mean?

Well. netcurl, or TorneLIB as it was previously named, was actually built with war in mind. It was supposed to scrape proxy lists, collect the proxies and then register them for use in a DNSBL. In short: blacklist trolling haters on the internet. Then work got in my way, and I found out that netcurl had a perfect role in the ecommerce projects we built (yes, we). As it supported both REST and SOAP in an environment where the library itself chose an available driver, it could handle much more than just warfare applications.

Time passed, and a few weeks ago corona said hello. At that point most of us stayed home, so when the “real work” ended I could take care of netcurl without interruptions. And here we are, almost done. And this time, the library should be able to handle things better than it used to.

The long-awaited netcurl 6.1 is soon ready for release

A mailing list for netcurl has been established at lists.tornevall.net. It is open for subscriptions, and information like this will be posted there.

From now on, stable releases of netcurl v6.0 will no longer be pushed into the master repository. A new stable/6.0 branch has been created for maintenance releases.

Version 6.1 is quite close to getting a new tag, and there aren’t many compatibility issues left to take care of. The only bigger part that 6.1 is still waiting for is the complementary network module.

There are a few other components in netcurl 6.1 that are under consideration right now: should the release wait for them to be completed, or should they follow after a first primary release? I’ll get back to this.

NetCurl is in active development

This old project, once born as a proxy scraping tool, is alive again. Well, in fact it has been alive but idle for several years, as its purpose changed course completely when I started pushing for its use in ecommerce platforms. The project turned out to be a great combo-mixer of communication tools, since it had great failover possibilities. However, times change, and it needs more than this now.

My wish is to reinstate a proxy scraper. This project was written in the early years of PHP 5.3, and that says a lot about what it was and what it can now become instead. As you can see on the left side, the support for failovers is growing. This of course takes time to implement but, as I just wrote, I believe that it must be done.

But that is not everything. By making the client more compliant with reality, it could also become part of other projects, like the network tools, as those tools won’t be worth much if there’s no data scraping available. For example, there was the “fnarg project” (better known today, to a smaller number of people, as part of the giraffe project), which specialized in RSS fetching. The fetcher was built so that it did not only fetch new articles. It also kept track of old ones, and of whether they were changed/edited over time.

All of this has forced me into a state that I’ve refused to be in for several years now. But realizing that PHP moves forward, and not much backward, this must be done before it’s too late.
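Tracking edits to old articles can be done by storing a fingerprint per feed item and comparing it on the next fetch. The following Python sketch shows one possible approach under stated assumptions; it is not the fnarg fetcher’s code. It assumes plain RSS 2.0 (un-namespaced item, guid, link, title and description elements), and the function names are invented for the example.

```python
import hashlib
import xml.etree.ElementTree as ET


def item_fingerprints(rss_text):
    """Map each RSS item's identifier to a hash of its title and description."""
    fingerprints = {}
    for item in ET.fromstring(rss_text).iter("item"):
        # Prefer <guid>, fall back to <link> as the item's identity.
        key = item.findtext("guid") or item.findtext("link") or ""
        content = (item.findtext("title") or "") + "\n" + (item.findtext("description") or "")
        fingerprints[key] = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return fingerprints


def diff_feed(previous, current):
    """Compare two fingerprint maps and report new and edited item keys."""
    new = sorted(key for key in current if key not in previous)
    edited = sorted(key for key in current if key in previous and previous[key] != current[key])
    return new, edited
```

Storing only the hashes keeps the saved state small; a scraper that also wants to show what changed would need to keep the old text as well.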

Welcome to version 6.1
