News archive

SoFiA Production Update 5
Fixed a permission issue with the VM image on the server (sorry!).
Added another two sets of workunits to the queue, although we still have a fair amount of work remaining as it is, which is a good sign.
6 Dec 2017, 1:59:27 UTC · Discuss


SoFiA Production Update 4
Updated the Windows 32-bit and 64-bit VirtualBox wrappers to version 26200.

Pushing out 4 sets of work units this time instead of 3 to try and keep work flowing nicely.

I'm still trying to investigate the potential cause of invalid workunits. I'm currently testing workunits on several different VirtualBox versions in case one version in particular is causing issues. I'm also going to test the VM with VirtualBox 5.2, just to confirm it works properly. We're still waiting on 5.2-compatible wrappers from BOINC, though.
29 Nov 2017, 4:46:11 UTC · Discuss


SoFiA Production Update 3
Spent some time improving the speed of the assimilator to grind through the large assimilation backlog we currently have.

Currently processing another 3 sets of workunits. They should be ready to go in a few hours.
22 Nov 2017, 4:30:00 UTC · Discuss


SoFiA Production Update 2
The work for this week is broken up into 4 different runs (16 -> 19), each of which contains a quarter of the 486 parameters that we're running through SoFiA.

The aim here is to reduce the run time of each work unit, and reduce the amount of data that needs to be re-calculated whenever a workunit is marked invalid.

I'm still investigating the actual cause of these invalid workunits, but I haven't been able to reproduce the errors locally on my own machines. In the meantime, I hope this quick fix eases the load a little.
15 Nov 2017, 6:43:10 UTC · Discuss


SoFiA Production Update 1
Added the ability for the validator to scale the credit awarded for workunits. Credit scaling is currently set to x1.5 to make up for the validation errors we've been having.

Currently processing a new batch of work units, which should be done within the next 12 or so hours.
8 Nov 2017, 6:54:13 UTC · Discuss


SoFiA Production Workunits
Greetings Everyone!

I'm pleased to announce that I'm starting production workunits for SoFiA. After the removal of the Wm50 parameter from SoFiA's parameter files, the previous determinism issues disappeared, so I'm ready to begin pushing out production work units.

These work units are exactly the same as the Duchamp ones, around 15MB in size. This time, however, we're running 486 different parameter combinations, up significantly from Duchamp's 176. You can expect longer WU run times, with a larger amount of credit per WU.

The project scientists are also in the process of re-chopping the DINGO cube into larger, 100MB pieces that we'll also be running through SoFiA. When these cubelets are available, I will be splitting them into a set of 5 different runs, with a different parameter set for each run. The first 4 runs will each contain 100 of the 486 SoFiA parameters, with the final run covering the remaining 86 parameters.
I'll keep you updated as to when these workunits will be available.
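For anyone curious about how that split shakes out, here's a minimal sketch in Python (illustrative only; the parameter file names are placeholders, not the actual splitting script):

    # Split the 486 SoFiA parameter combinations into runs of at most 100:
    # four runs of 100 and a final run of the remaining 86.
    def split_parameters(parameters, chunk_size=100):
        """Yield successive chunks of the parameter list, one chunk per run."""
        for start in range(0, len(parameters), chunk_size):
            yield parameters[start:start + chunk_size]

    parameters = ["sofia_param_%03d.par" % i for i in range(486)]  # placeholder names
    runs = list(split_parameters(parameters))
    print([len(run) for run in runs])  # [100, 100, 100, 100, 86]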

I must also recommend that everyone NOT update to VirtualBox 5.2 for the time being. I'm currently waiting on BOINC to officially release their VirtualBox 5.2-compatible wrapper versions.

Finally, I'm going to attempt to make time to check the Sourcefinder message boards every day, instead of only on Wednesdays when I'm actually working on Sourcefinder tasks. Hopefully this'll help me be significantly more responsive to any Sourcefinder related issues that crop up over the coming months.

Thank you everyone for being patient with the rather slow progress of the SoFiA beta. I hope to have a fairly steady flow of production work units.

Happy Crunching!

Sam.
1 Nov 2017, 2:43:17 UTC · Discuss


SoFiA Beta Update 6
Apologies, I forgot to post my changelog for yesterday!

There was an issue with workunit name clashes between Duchamp and SoFiA. The original naming convention was "runid_cubename", with no differentiation between SoFiA and Duchamp workunits.
The new naming convention is simply "appname_runid_cubename", which avoids the possibility of future clashes.
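As a concrete illustration of the new convention (a sketch only; the helper function is mine, not project code):

    def workunit_name(appname, run_id, cube_name):
        """Build a workunit name using the 'appname_runid_cubename' convention."""
        return "%s_%s_%s" % (appname, run_id, cube_name)

    # Duchamp and SoFiA workunits for the same cube and run no longer clash:
    print(workunit_name("duchamp", 10, "cubelet_0042"))  # duchamp_10_cubelet_0042
    print(workunit_name("sofia", 10, "cubelet_0042"))    # sofia_10_cubelet_0042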

I've also contacted the SoFiA devs about the determinism issue I outlined last week, and they're looking into it.

I want to get started on real work as soon as possible, so if this week's work goes smoothly, I'll see about moving on to production work next week.
19 Oct 2017, 0:26:12 UTC · Discuss


SoFiA Beta Update 5
Reduced the verbosity of the assimilator logs so I can actually find things in them.

Fixed an extremely unlikely bug in the assimilator that fires if Amazon's S3 servers can't be reached.
Fixed a bug with the assimilator attempting to remove temporary folders that don't exist.

It also appears as though one of the SoFiA result parameters, Wm50, is non-deterministic. Some runs calculate it without error, but others fail. Here's an example error log from two different runs of the same parameter set:

Result 1 (courtesy of emoga):

Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
Warning (Parametrization): Kinematic major axis derived from just 2 data points.
Error (Parametrization): Calculation of Wm50 failed (3).
Warning (Parametrization): Failed to measure source line width.
Error (Parametrization): Cannot fit ellipse, source flux <= 0.
Warning (Parametrization): Ellipse fit failed.
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
Result 2 (courtesy of LCB001):
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
Warning (Parametrization): Kinematic major axis derived from just 2 data points.
Error (Parametrization): Cannot fit ellipse, source flux <= 0.
Warning (Parametrization): Ellipse fit failed.
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.

Note how in the first case, the Wm50 parameter fails to calculate, but in the second there's no issue calculating it.
This is something I'll bring up with the SoFiA team, as it might be an issue somewhere that they can address.

I've also received a set of parameter files that the scientists would like us to run, so for this week we'll be running a test on those parameters to make sure everything works fine.
There are a total of 486 parameters to run per cubelet (more than 2.5x Duchamp's 176!), so work units should take a bit longer to complete than normal. I tested the parameters locally and they took around 30 mins to process one work unit.
11 Oct 2017, 2:54:41 UTC · Discuss

SoFiA Beta Update 4
Fixed yet another bug in the assimilator.

Modified the way SoFiA returns output files. There was an issue with various parameter combinations causing SoFiA to return outputs in different formats. This should be accounted for properly now.

Built a new, extended parameter set for SoFiA testing. We're now testing about 100 different combinations.

Pushed out new work units using the extended parameter set.
27 Sep 2017, 7:17:38 UTC · Discuss


SoFiA Beta Update 3
Fixed a few bugs in the assimilator.
Improved logging in the validator.

Out of the 1736 work units sent out in the last update, I have the results for 1723 of them stored in our cloud storage system. That's a 99% success rate!

There were a couple of issues with the assimilator that were preventing the results from being properly stored in our database, though. We store the sources found in our database, as well as storing the raw files returned by a client in cloud storage.
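In outline, assimilation does two things with each canonical result: record the detected sources in the database, and push the raw result archive to cloud storage. A minimal sketch of that flow, assuming made-up table, bucket and field names (the real assimilator lives in the server back end and differs in the details):

    import sqlite3
    import boto3

    def assimilate_result(result_archive, sources, db_path="sourcefinder.db",
                          bucket="example-sourcefinder-results"):
        """Store detected sources in the database and the raw archive in S3."""
        # 1. Record each detected source in the project database.
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE IF NOT EXISTS source "
                     "(workunit TEXT, ra REAL, dec REAL, flux REAL)")
        conn.executemany(
            "INSERT INTO source (workunit, ra, dec, flux) VALUES (?, ?, ?, ?)",
            [(s["workunit"], s["ra"], s["dec"], s["flux"]) for s in sources])
        conn.commit()
        conn.close()

        # 2. Keep the raw client output in cloud storage for later inspection.
        s3 = boto3.client("s3")
        s3.upload_file(result_archive, bucket, result_archive)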

Hopefully this will be the last week of the SoFiA beta, and then we can proceed into full processing mode!

Edit: I've also updated the project description that appears on the main page to include SoFiA.
20 Sep 2017, 1:26:26 UTC · Discuss


SoFiA Beta Update 2
Increased VM RAM size to 256MB. This'll hopefully stop that furious swapping that was happening on Linux.

Compacted the VM image from 1.9GB down to 1.6GB. The compressed image is 446MB. I'll see if there's anything more I can strip out of the VM at a later date; I might be able to strip it down by another few hundred MB.

Released 3x the number of test work units as before. There should be around 1500 ready to go.
13 Sep 2017, 4:51:07 UTC · Discuss


SoFiA Beta Update 1
Removed CNHI Filter from all test parameters. This was causing huge workunit runtimes, ultimately resulting in work units timing out.

Added a per-parameter timeout of 30 mins. The VM will now attempt to process each parameter set for a max of 30 mins before timing out. In my local tests, most parameters only take around 2 - 5 mins to run, so this should be more than enough time. I'll keep an eye on whether a 30 min timeout becomes an issue in the future.
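The timeout itself is simple in concept: the client inside the VM runs SoFiA once per parameter set and abandons any run that exceeds the limit. A rough sketch of the idea (how SoFiA is actually invoked inside the VM is an assumption here, not the real client code):

    import subprocess

    PER_PARAMETER_TIMEOUT = 30 * 60  # seconds

    def run_parameter_set(parameter_file):
        """Run SoFiA on one parameter file, giving up after the timeout."""
        try:
            subprocess.run(["python", "sofia_pipeline.py", parameter_file],
                           check=True, timeout=PER_PARAMETER_TIMEOUT)
            return True
        except subprocess.TimeoutExpired:
            print("%s timed out after 30 minutes, skipping" % parameter_file)
            return False
        except subprocess.CalledProcessError:
            print("%s failed" % parameter_file)
            return False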

Dropped VM RAM size to 128MB from 1024MB. This should significantly reduce the memory requirements of the host system.

Updated vboxwrappers for Apple (x86 and x64) to version 26199 from version 26196.


A new set of test work units are available now.
6 Sep 2017, 3:07:54 UTC · Discuss


SoFiA Test Work Units
As of today, my main work on the SoFiA application is complete.
The final step is testing to ensure the validator, assimilator etc. all function correctly.
For this, I've created a beta test app for SoFiA. I'll be running test work units through this app until I've ironed out all significant issues.

If you wish to help me run test work units, you can set your project preferences to allow beta test apps on the following page:
https://sourcefinder.theskynet.org/duchamp/prefs.php?subset=project

I'm hoping that, overall, the testing phase of SoFiA won't be too long and we'll have a production SoFiA app running within the next few weeks.
30 Aug 2017, 5:00:14 UTC · Discuss


The future of Sourcefinder
As some of you will remember from my previous post, Sourcefinder is going to see some significant changes coming in the next few months. I thought it was time I properly outlined exactly what's happening.

The introduction of SoFiA
I'm currently working on integrating the SoFiA sourcefinding application into Sourcefinder. In order to integrate SoFiA cleanly, I'm working on a fairly significant overhaul of a lot of the Sourcefinder backend systems that will allow support for multiple sourcefinding applications. My aim here is to make it as easy as possible to add new sourcefinding applications to the system in the future. If anyone is interested, you can see the changes I'm making in the module_rework branch of our git repository.
I'll most probably be sending out quite a few test work units while working on integrating SoFiA, so you'll probably get odd bursts of work until it's integrated properly.
Once SoFiA is working correctly, we'll be processing all of the work units in the simulated cube again, but this time using SoFiA instead of Duchamp.

SoFiA vs Duchamp research paper
The scientists who will be using the data from this project plan on writing a research paper comparing the performance of Duchamp and SoFiA as sourcefinders. From what I've been told, the data analysis side of this project is most likely to be performed by a student on an ICRAR studentship, either at the end of this year or the end of next year.
I plan on ensuring that as many people as possible who contributed to Sourcefinder will have their names/usernames listed in the research paper before it's published.
You'll hear more about this paper in a few months once SoFiA is integrated properly into Sourcefinder.

Real data from ASKAP
As I stated in the previous post, we should have some real data from ASKAP to process on Sourcefinder in the coming months. The moment this data becomes available to me, I'll be sending out work units for Duchamp, and later for SoFiA. I don't have a timeframe on when this data will be available aside from "soon", but I'm hoping we'll see it within a few months.

Visualisation of Sourcefinder results
I plan on developing a little web applet that will probably live on http://www.theskynet.org to allow anyone to view the sources found by Duchamp and SoFiA. My current plan for this applet is to display an image of the cube slice that the source was found in, a small highlight indicating the source, and a list of the users who contributed to finding the source.
I'll be starting work on this applet after SoFiA is integrated into Sourcefinder.

Workunits
The workunits that were lost in the back end storage issue I spoke about in the last post have all been reprocessed (thank you!). This means there won't be a significant number of workunits for Sourcefinder for a little while. I'll try to make this period as short as possible (hopefully a month or two at most), but it really depends on how easy it is to integrate SoFiA.

Project URL change
At some point I plan on changing the project URL from https://sourcefinder.theskynet.org/duchamp to https://sourcefinder.theskynet.org/sourcefinder. The original name 'duchamp' was a carry-over from before I inherited this project. I didn't think we'd be running multiple applications, so I just left it. Obviously once SoFiA is working, the 'duchamp' part of the URL won't make much sense, so I'll be changing it to the more generic 'sourcefinder'. I'll give everyone a week's notice before I change anything, so you should have time to change over easily. I'll also ensure the old URL still works, but simply redirects to the new one.

Gridcoin whitelist
There's currently a poll up for adding Sourcefinder to the Gridcoin whitelist. If you're interested in voting yes or no, please check out the post Erkan made about it.

I think that's about everything I have for now. I'll try to keep everyone as updated as possible on all of these issues.
Thank you again for helping out with Sourcefinder!

Edit: Additional Information as of 2nd August, 2017
The ASKAP data is still a work in progress, and I've been given an ETA of "before the end of this year". Data measurements on ASKAP have been taken at different rotations of the Earth, and so need to be Doppler corrected to be stacked into a cube appropriately. This process is still being worked on, but they expect to make significant progress on finalising it in September.

SoFiA work units will have to be around 100MB, as opposed to the 10MB of Duchamp work units. I've been told that this is because SoFiA requires a larger cube to develop a source reliability estimate. The 10MB cubes that Duchamp used simply aren't large enough to develop a meaningful reliability measure.
In order to not reduce the number of work units by a factor of 10, I plan on releasing the same cube multiple times with a different parameter set for each work unit.
Originally with Duchamp, each cube was released as one work unit with 176 different parameters to run on that cube.
With SoFiA, each cube will be released in multiple work units, with a smaller number of parameters per cube.
Ultimately, this will result in a set of larger, slightly longer work units than Duchamp's.
19 Jul 2017, 0:14:08 UTC · Discuss


Real data coming soon!
I've just arrived back from a meeting with Kirsten, and I have some extremely good news to share with everyone!

ASKAP has almost completed observations of its first ~500GB cube, and we're going to be processing it when it's done!
Running a sourcefinding application on ASKAP data hasn't been done yet, so this is going to be a world first.

While we're waiting for ASKAP to be done, we're going to be running the simulated super cube again, this time using a new sourcefinding application called SoFiA.
We're going to be looking at whether SoFiA is any better at finding sources than Duchamp is.

Once we have data from both Duchamp and SoFiA, I'll be working on cross-matching the sources they found with the original source catalogue, and then we'll have a clear picture of which is the better sourcefinder.
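The cross-matching boils down to nearest-neighbour matching on sky position within some tolerance. A simple sketch of the idea (positions in degrees; the tolerance and field names are placeholders, not the final matching pipeline):

    import math

    def angular_separation(ra1, dec1, ra2, dec2):
        """Approximate separation in degrees (good enough for small offsets)."""
        dra = (ra1 - ra2) * math.cos(math.radians((dec1 + dec2) / 2.0))
        return math.hypot(dra, dec1 - dec2)

    def cross_match(found_sources, catalogue, tolerance_deg=0.01):
        """Match each found source to the nearest catalogue source within tolerance."""
        if not catalogue:
            return []
        matches = []
        for src in found_sources:
            best = min(catalogue, key=lambda cat: angular_separation(
                src["ra"], src["dec"], cat["ra"], cat["dec"]))
            if angular_separation(src["ra"], src["dec"],
                                  best["ra"], best["dec"]) <= tolerance_deg:
                matches.append((src, best))
        return matches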

I'm also told that there's a very high chance of a paper being written on the comparison between Duchamp and SoFiA as sourcefinders, so I'll be sure to keep you all up to date if I hear any more about it.

Finally, there's been a bit of a back end storage issue with some of the data we've processed, and I'll need to re-run a few sets of work units, so expect some more work to be coming fairly soon.

Edit: Please see my other post that expands on some of the things I've discussed in this one.
21 Jun 2017, 4:52:54 UTC · Discuss


681GB Down
Congratulations everyone, we made it through the first 681GB of Sourcefinder cubelets!
You've all done amazing work so far, and everyone at theSkyNet thanks you for it.

Now that we've processed a large portion of the simulated cube data, I'm going to be focusing on developing some visualisation tools to allow everyone to actually see the sources they've processed. This tool is going to be incorporated into theSkyNet.org.
You're basically going to be able to select a cubelet and parameter set to view, and you'll be shown the locations of the sources within that cubelet along with the names of the users who found the sources.
I'm also planning on adding credit stats from Sourcefinder to your dashboard on theSkyNet.org at some point.

To my knowledge, there should also be another ~300GB of data somewhere to process, as the original supercube was around 1TB in size. I believe this data wasn't extracted from the supercube for whatever reason, so I'm going to be working out how to extract it.
Until then, there won't be any more Sourcefinder work units, unfortunately.

Anyway, thank you again to all of you for crunching all of this data. I hope you all enjoy the visualisation tool when I've finished building it :)
14 Jun 2017, 0:15:54 UTC · Discuss


Update 5 June 2017
The set of work units I'm pushing out this week are the last of our 681GB batch of data.
To my knowledge, there is still more data to process once this is done, but I'll need to find out how to extract it from the main supercube.
All of the data up until now had been pre-extracted ahead of time for me.

Here are our stats for this week:

Total Cubes: 40269
Total Results: 28105
Total Canonical Results: 12197. 30.2888077678%
Average Results Per Cube: 0.697931411259

Good Results: 26197. 93.2111723893%
Bad Results: 1908. 6.78882761075%

Client Bad: 1906. 6.78171143925%
Client InProgress: 2. 0.00711617149973%
Client Good: 26197. 93.2111723893%

Server Inactive: 0. 0.0%
Server Unsent: 0. 0.0%
Server InProgress: 581. 2.06724782067%
Server Over: 27524. 97.9327521793%

7 Jun 2017, 5:07:39 UTC · Discuss

Update 31 May 2017
Not much this week, just a new set of work units.

Here are this week's stats:

Total Cubes: 39928
Total Results: 38293
Total Canonical Results: 11109. 27.8225806452%
Average Results Per Cube: 0.959051292326

Good Results: 35874. 93.6829185491%
Bad Results: 2419. 6.31708145092%

Client Bad: 2405. 6.28052124409%
Client InProgress: 14. 0.0365602068263%
Client Good: 35874. 93.6829185491%

Server Inactive: 0. 0.0%
Server Unsent: 9512. 24.8400490951%
Server InProgress: 1777. 4.6405348236%
Server Over: 27004. 70.5194160813%

31 May 2017, 5:46:54 UTC · Discuss

Update 24 May 2017
Modified text on the create account page to clarify that the user will require BOINC when signing up to the project.

Updated text on the front page, as we're not a beta project any longer.

Updated forum breadcrumb trail to always have a link back to the main Message Boards.

Pushed out another batch of work units. I'm also transferring some reserve units in case we run out again.

Here are the stats for this week:

Total Cubes: 31248
Total Results: 26217
Total Canonical Results: 8921. 28.5490271377%
Average Results Per Cube: 0.838997695853

Good Results: 24264. 92.5506350841%
Bad Results: 1953. 7.44936491589%

Client Bad: 1907. 7.27390624404%
Client InProgress: 46. 0.175458671854%
Client Good: 24264. 92.5506350841%

Server Inactive: 0. 0.0%
Server Unsent: 4705. 17.9463706755%
Server InProgress: 654. 2.49456459549%
Server Over: 20858. 79.559064729%

24 May 2017, 6:06:53 UTC · Discuss

Update 17 May 2017
Finally got SSL up and running for the project. You can all use https now!

Moved the project URL over to https://sourcefinder.theskynet.org/duchamp. Please see the other announcement post concerning this.

Whipped up a logo for the main project page.

Pushed out a new set of work units.

Here are our run statistics for last week:

Total Cubes: 27776
Total Results: 32966
Total Canonical Results: 9373. 33.7449596774%
Average Results Per Cube: 1.18685195853

Good Results: 30016. 91.0513862768%
Bad Results: 2950. 8.94861372323%

Client Bad: 2771. 8.40563004307%
Client InProgress: 178. 0.539950251775%
Client Good: 30017. 91.0544197052%

Server Inactive: 0. 0.0%
Server Unsent: 8823. 26.7639386034%
Server InProgress: 955. 2.89692410362%
Server Over: 23188. 70.339137293%

17 May 2017, 6:28:02 UTC · Discuss

New Project URL
Now that SSL is correctly set up for Sourcefinder, I've changed the project URL to https://sourcefinder.theskynet.org/duchamp/.
The only thing you'll need to do is remove the duchamp project from BOINC, then re-add it using the new URL.
The old URL is still valid for work unit uploads/downloads, so all current work units that are registered using this URL should still complete and upload fine.
17 May 2017, 1:46:49 UTC · Discuss


Update 10 May 2017
Dropped max_wus_to_send from 8 to 1. This should, according to the BOINC wiki, limit the number of work units sent per RPC cycle to 1 per CPU.
This is being done to hopefully reduce performance issues some users were experiencing, where BOINC would try to simultaneously run a large number of work units straight off the bat.

We have a new domain name too! http://sourcefinder.theskynet.org. I plan on switching the project URL from the current raw IP address to the proper URL next Wednesday. Everyone will receive a message from BOINC asking them to re-attach to the project, just a heads up!

The stats for this week are as follows:

Total Cubes: 24304
Total Results: 31683
Total Canonical Results: 11615. 47.7904871626%
Average Results Per Cube: 1.30361257406

Good Results: 28717. 90.6385127671%
Bad Results: 2966. 9.3614872329%

Client Bad: 2764. 8.72392134583%
Client InProgress: 201. 0.634409620301%
Client Good: 28718. 90.6416690339%

Server Inactive: 0. 0.0%
Server Unsent: 3431. 10.8291512799%
Server InProgress: 874. 2.75857715494%
Server Over: 27378. 86.4122715652%
Still sitting above 90% on the good results. Everything is going fairly smoothly from what I can see :)
10 May 2017, 3:45:43 UTC · Discuss

Update 3 May 2017
Well, we've got a good amount of work sitting in the queue; over 8000 work units are still waiting to be processed.
I've got another 4 sets of work units sitting on standby for when we start running low, so hopefully we won't have any supply issues any time soon!

Stats for this week:

Total Cubes: 19096
Total Results: 41783
Total Canonical Results: 12991. 68.0299539171%
Average Results Per Cube: 2.18804985337

Good Results: 38333. 91.7430533949%
Bad Results: 3450. 8.25694660508%

Client Bad: 3306. 7.91230883374%
Client InProgress: 144. 0.344637771342%
Client Good: 38333. 91.7430533949%

Server Inactive: 0. 0.0%
Server Unsent: 8859. 21.2024028911%
Server InProgress: 1634. 3.91068137759%
Server Over: 31290. 74.8869157313%

Still above 90% on the good results which is, uh, good!

Also, our old web developer has returned to theSkyNet on a casual basis, and is helping me out with some of the issues with theskynet.org. We'll also be getting a proper domain name for Sourcefinder soon (most likely something like sourcefinder.theskynet.org).
3 May 2017, 3:39:43 UTC · Discuss

Update 26 April 2017
Added a field to the main user page that shows your total CPU time and average CPU time per result across all hosts.

Pushed out another set of work units for this week. I'm currently in the process of packaging up some additional work units, as we ran out fairly fast last week. The extra work units are currently transferring over to the main server, so I'll push them out later this week (probably tomorrow).

The stats for this first week are:

Total Cubes: 10416
Total Results: 19199
Total Canonical Results: 6737. 64.6793394777%
Average Results Per Cube: 1.84322196621

Good Results: 17607. 91.7079014532%
Bad Results: 1592. 8.2920985468%

Client Bad: 1562. 8.13584040835%
Client InProgress: 30. 0.156258138445%
Client Good: 17607. 91.7079014532%

Server Inactive: 0. 0.0%
Server Unsent: 3491. 18.1832387103%
Server InProgress: 371. 1.9323923121%
Server Over: 15337. 79.8843689776%
91% good results so far, which is fairly decent. It seems likely to me that the additional bad results are due to the increase in the number of work units sent out to each host. We've had a few people posting on the forums about getting too many work units. Hopefully the percentage of bad work units will settle down over the course of the next few weeks as people adjust their project preferences. If anyone else has had an issue with too many work units on any of their hosts, please see this thread.

Thank you, everyone.
26 Apr 2017, 7:16:07 UTC · Discuss

We're now ready for Production!
Yup, the day has come. Sourcefinder is finally moving from testing to production!

Most of the bugs and issues we've been having during testing have been ironed out to a level that I'm happy with.
All testing work units have been cancelled, and all work units and results have been archived.
We're now going to begin to crunch actual data, 681GB of it!

27G cubelets_1
27G cubelets_2
27G cubelets_3
27G cubelets_4
27G cubelets_5
27G cubelets_6
27G cubelets_7
27G cubelets_8
27G cubelets_9
27G cubelets_10
27G cubelets_11
27G cubelets_12
27G cubelets_13
27G cubelets_14
27G cubelets_15
27G cubelets_16
27G cubelets_17
27G cubelets_18
27G cubelets_19
27G cubelets_20
27G cubelets_21
27G cubelets_22
27G cubelets_23
27G cubelets_24
27G cubelets_25
8.6G cubelets_26
681G total

We're starting with cubelet sets 1 to 4.

A few other notes:

I've added a new process to the server to archive work unit and result data older than 2 weeks, so you won't be able to see any of that archived data on the website.

Any credits earned during testing will remain during production.

All production work units will be registered under run 10.

The limit on the number of work units sent to a single host will be increased. I'm planning on modifying the max work unit configuration from:
<max_wus_to_send>1</max_wus_to_send>
<max_wus_in_progress>2</max_wus_in_progress>
<max_ncpus>4</max_ncpus>
to
<max_wus_to_send>8</max_wus_to_send>
<max_wus_in_progress>3</max_wus_in_progress>
<max_ncpus>16</max_ncpus>

This should mean up to 16 CPUs on each machine will be used, and each user will be able to have up to 3 work units per CPU at any one time. This means the maximum number of WUs that a client can be running at any one time is now 48; just make sure you can crunch them all in time!
Additionally, each scheduler request cycle will provide up to 8 work units.
These numbers are tentative as well; if there are any issues or recommendations, I'm more than happy to change them.

A few future things:

I plan on finally updating theSkyNet website to accurately reflect the current state of the Sourcefinder project.

I'll be reporting the project stats each week, for those who are interested.

I'll definitely be pushing work units out every Wednesday, and sometimes on other days of the week if I get time.

My main focus from now on is fixing the issues with theSkyNet website, as it's been running on life support for far too long.


Thank you to everyone who has helped out with the Sourcefinder project so far; we greatly appreciate the work you've all done. It's because of all of you that this project was able to eventually reach production :).
19 Apr 2017, 1:22:01 UTC · Discuss

Update 12 April 2017
I haven't heard much back from anyone concerning our VirtualBox problems, so I've decided that we'll just have to put up with the odd hitch along the way.
Most of the errors were fairly uncommon, so hopefully we'll be able to deal with them as they come.

As of next week (19th April) I plan on moving Sourcefinder from beta to proper production. We'll still use this same server, so there'll be no need to re-attach or swap to another project.

Next week I'll be erasing all of the work units we've run so far and cleaning up the database in preparation for the production runs. I'm not going to modify anyone's credit though; you all deserve to keep that for the work you've done so far :)

I'm going to let the current set of work units run for the rest of this week and compare the results to those from a local run I've been doing. I'm expecting identical or very close to identical results.

So thank you everyone for your efforts so far, and I look forward to finally getting this project complete :)
12 Apr 2017, 1:32:15 UTC · Discuss


Changelog 5 April 2017
Fixed a small bug with the assimilator.

It turns out that the project had been disabled the whole week, so the db dump hasn't been running and neither has the assimilator. I'm just waiting for everything to get up to speed again, then I'll push out some new work units and provide this week's stats.

I've been posting some of our VirtualBox issues to the BOINC forums as well as the boinc_dev mailing list. I'm still waiting on responses.
5 Apr 2017, 6:17:49 UTC · Discuss


Changelog 29 March 2017
As per the recommendations made in this thread, I've made a few changes to hopefully prevent the file deleter from erasing the parameters files.

Our weekly stats are as follows:

Total Cubes: 6944
Total Results: 14575
Total Canonical Results: 6707. 96.5869815668%
Average Results Per Cube: 2.0989343318

Good Results: 13895. 95.3344768439%
Bad Results: 680. 4.66552315609%

Client Bad: 678. 4.65180102916%
Client InProgress: 2. 0.0137221269297%
Client Good: 13895. 95.3344768439%

Server Inactive: 0. 0.0%
Server Unsent: 0. 0.0%
Server InProgress: 239. 1.6397941681%
Server Over: 14336. 98.3602058319%


We've managed to hit the less-than-5% error margin for work units. Congratulations!

I should also note that we have very few invalid work units. Over the course of the week, the new validator only picked up 4 invalid work units, which is a pretty good sign.

I'm still in the process of running a number of work units locally so I can compare the BOINC results to the local results. I'll probably have enough local results by next week or the week after to start properly looking at the accuracy of the data coming back from BOINC.

I'm also looking for any advice people might have on where I can look for assistance regarding the various VirtualBox issues we've been having. I tried the boinc_projects mailing list, although no one replied to my message. Next, I'm going to post on the main BOINC message boards, but if anyone knows a better place to ask for BOINC + VirtualBox specific advice, I'm all ears.
29 Mar 2017, 0:37:14 UTC · Discuss

Changelog 22 March 2017
Deployed a new validator that will save invalid work units locally so I can investigate possible computation errors.

Modified the assimilator to delete completed work units from the project's upload directory.

Added a new log rotator to ensure log files don't get too large, and so searching them is easier.
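For the curious, time-based rotation in Python is only a few lines; a minimal sketch, assuming daily rotation and two weeks of retention (the file name and schedule here are assumptions, not the server's actual setup):

    import logging
    from logging.handlers import TimedRotatingFileHandler

    # Rotate the log at midnight and keep two weeks of old files.
    handler = TimedRotatingFileHandler("assimilator.log", when="midnight", backupCount=14)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger = logging.getLogger("assimilator")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.info("assimilator started")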

Made a post to the boinc_projects mailing list for assistance with some of the virtualbox issues we've been having. Hopefully someone else has some advice or knowledge that will help us solve them.

This week's stats are as follows:

Total Cubes: 6944
Total Results: 31836
Total Canonical Results: 5021. 72.3070276498%
Average Results Per Cube: 4.58467741935

Good Results: 14426. 45.3134815932%
Bad Results: 17410. 54.6865184068%

Client Bad: 459. 1.44176404071%
Client InProgress: 16672. 52.3683879884%
Client Good: 14705. 46.1898479709%

Server Inactive: 0. 0.0%
Server Unsent: 6. 0.0188465887674%
Server InProgress: 299. 0.939188340244%
Server Over: 31531. 99.041965071%
The main reason we have so many bad results is a combination of me deploying a buggy version of the validator, and the parameters_4.tar.gz file being removed by the BOINC file deleter.
Hopefully this coming week will be interesting because the new validator should be fixed, and both the validator and assimilator will store any invalid/errored work units, which will allow me to get a much better look at work units that are actually failing.

It also seems as though developing a non-VirtualBox version of the project could be a lot more difficult than I first imagined, because the Duchamp application we run relies on a set of external astronomy libraries, and I don't know if I can compile/deploy them properly for Windows or Mac. I'm going to do some further investigation/experimentation and see if I can work something out.
22 Mar 2017, 4:43:38 UTC · Discuss

Changelog 15 March 2017
Fixed an internal issue with the file deleter not removing old wu files.

EDIT: I was testing out a new validator for this batch, but it's causing issues so I've had to revert back to the old validator.

I'm currently doing a local run of all 7000 test work units (which will probably take a while...) so we have a set of known correct results to compare the project's results to. This way we can get a much more accurate estimate of the correctness of the results.

I also found out that, somehow, only 2000 of the 7000 work units were sent out last week. I don't know how that happened, but you'll be getting all 7000 this time :)

This week's statistics:

Total Cubes: 1968
Total Results: 4164
Total Canonical Results: 1848. 93.9024390244%
Average Results Per Cube: 2.11585365854

Good Results: 3940. 94.6205571566%
Bad Results: 224. 5.37944284342%

Client Bad: 210. 5.04322766571%
Client InProgress: 14. 0.336215177714%
Client Good: 3940. 94.6205571566%

Server Inactive: 0. 0.0%
Server Unsent: 0. 0.0%
Server InProgress: 120. 2.8818443804%
Server Over: 4044. 97.1181556196%
We actually have a slightly higher percentage of good results this time (up about 1%), which is good.
We're still aiming to get that up to 95%, though.

Additionally, I'm wondering if it would be worth investing the time in a version of this project that doesn't use VirtualBox. Currently, vboxwrapper and VirtualBox issues have been the biggest causes of errors for this project, and a lot of that might be solved by simply bypassing VirtualBox altogether and just having native binaries for each platform.
The actual core of the client is written in Python, but I believe I can compile that to a binary using Cython (or a similar tool).
What do you all think: is it worth trying a non-VirtualBox approach to avoid the problems we've been having?
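For anyone wondering what the Cython route looks like, it usually amounts to a small build script like the sketch below (client_core.py is a hypothetical module name; whether the external dependencies can be bundled alongside it is a separate question):

    # setup.py -- build a compiled extension from a (hypothetical) client module.
    from setuptools import setup
    from Cython.Build import cythonize

    setup(
        name="sourcefinder-client",
        ext_modules=cythonize("client_core.py", language_level="3"),
    )
    # Build with:  python setup.py build_ext --inplace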
15 Mar 2017, 0:24:11 UTC · Discuss

Changelog 8 March 2017
Profiles are now enabled.
Team import is now enabled.

Modified the /duchamp/join.php page to specify that this project requires VirtualBox.

Configured the Akismet anti-spam system and reCAPTCHA to prevent forum spam.

Wrote up a script to report the work unit statistics for a particular week.
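The script is essentially a handful of counting queries over the week's results. A simplified sketch of the reporting side (field names and values are placeholders for what the real script reads from the BOINC database):

    from collections import Counter

    def weekly_stats(results):
        """Print a summary in the same shape as the stats listed below.

        `results` is a list of dicts with 'outcome' and 'server_state' fields.
        """
        total = len(results)

        def pct(n):
            return 100.0 * n / total if total else 0.0

        outcomes = Counter(r["outcome"] for r in results)
        states = Counter(r["server_state"] for r in results)
        print("Total Results: %d" % total)
        for label in ("Good", "Bad", "InProgress"):
            n = outcomes[label.lower()]
            print("Client %s: %d. %.2f%%" % (label, n, pct(n)))
        for label in ("Inactive", "Unsent", "InProgress", "Over"):
            n = states[label.lower()]
            print("Server %s: %d. %.2f%%" % (label, n, pct(n)))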

There'll be more work coming out very soon after this changelog, 4x as much as last week!

This week's current statistics are as follows:

Total Work Units: 1736
Total Results: 3711
Total Canonical Results: 1663. 95.7949308756%
Average Results Per Cube: 2.13767281106

Good Results: 3474. 93.6135812449%
Bad Results: 237. 6.38641875505%

Client Bad: 236. 6.35947184047%
Client InProgress: 1. 0.0269469145783%
Client Good: 3474. 93.6135812449%

Server Inactive: 0. 0.0%
Server Unsent: 0. 0.0%
Server InProgress: 76. 2.04796550795%
Server Over: 3635. 97.9520344921%
The key thing here is the canonical results, which we're already doing pretty well on, with only around 5% of work units still in progress since last Wednesday.
We're getting around two-and-a-bit results (tasks) per work unit, which is close to what we want. Ideally, we need as close to 2 results per work unit as possible, as this means there were no computation errors and no instances of BOINC needing to send out duplicate tasks.
We have 93.6% successful results, with 6.4% result failures, so that's also a pretty solid starting point. Our aim is to get this value down to below 5%.
8 Mar 2017, 6:26:14 UTC · Discuss

Future transition from beta to production.
Hi everyone,

So I thought I should lay out my plans for how Sourcefinder is going to transition from its current beta state into a production state.

From now on, the work units pushed out each week (on a Wednesday) will have their error rates measured. Any work units that fail due to computation errors or BOINC errors will be tracked and logged. Note that work units with invalid results won't be counted, because they still represent a valid computation. This will continue until we're consistently getting less than 5% of work units lost to errors; at that point I'll start looking at transitioning the current Sourcefinder server to a proper full production state.
From now on I'll be including the current work unit error rates in the weekly changelog.

If anyone has any thoughts on this, please let me know.
I'm also interested in whether my 5% target is too ambitious, and whether I should relax it to 10% or something similar.

Thanks,
Sam
1 Mar 2017, 5:57:06 UTC · Discuss


Changelog 1 March 2017
Fixed an issue with automatic server tasks not running. The BOINC stats dump should now update every 3 hours.

Added the following config entry to the project configuration:

<resend_lost_results>1</resend_lost_results>

Increased the credit multiplier from 1.3 to 3.0.

Implemented a new validator that will be tested throughout the week. It aims to solve some strange issues popping up with the BOINC sample validator.

And of course, more work units to process. :)
1 Mar 2017, 5:49:30 UTC · Discuss

Changelog 20th February 2017
Enabled the db exporter to allow users to show their stats for this project. Please wait up to 24 hours before the first set of stats are exported!

Bumped up the credit multiplier for workunits from 1 to 1.3. This should hopefully provide people with a more adequate amount of credit for each work unit.

Added the following lines to the vbox_job.xml config file:

<enable_vm_savestate_usage/>
<disable_automatic_checkpoints/>
in an effort to solve a problem described here: http://54.196.169.236/duchamp/forum_thread.php?id=17

Web queries for http://54.196.169.236 will now redirect to http://54.196.169.236/duchamp

Pushed another set of work units out.
20 Feb 2017, 3:21:07 UTC · Discuss

Changelog 15th February 2017
1. Added max_wus_to_send and max_wus_in_progress to the project config. Both are set to 8, so each user should get a maximum of 8 work units to process at any one time.

2. Updated vboxwrapper applications. Apps were updated to the following version numbers:

    Mac 32bit 26178 -> 26196

    Mac 64bit 26178 -> 26196

    Linux 32bit 26178 -> 26196

    Linux 64bit 26178 -> 26198

    Windows 32bit 26179 -> 26197

    Windows 64bit 26179 -> 26198


3. New set of work units pushed out, we have about 4000 waiting in the queue.

I hope that the new wrapper applications will solve some of the issues we've been having. I'll be waiting for your feedback :).

UPDATE:

max_wus_to_send dropped to 1
max_wus_in_progress dropped to 2
max_ncpus dropped to 4
15 Feb 2017, 5:09:25 UTC · Discuss


We're back up.
Hi everyone,

I was in the process of expanding the main hard disk for the Sourcefinder server, because the relatively small 50GB disk was nowhere near large enough to store all the required work units.
While I was attempting to expand the disk, I accidentally terminated the Amazon server that ran Sourcefinder. This destroyed all data on the server, and destroyed the server instance itself. This was a huge mistake on my part, and I'm extremely sorry for it.
In the meantime I've been working hard to get a new server running, this time with a larger hard disk and termination protection, so that my clumsy hands can't destroy it quite so easily.

I hope this hasn't been too much of an annoyance for many of you.

Sam
9 Feb 2017, 6:56:25 UTC · Discuss




News is available as an RSS feed.

