SoFiA Beta Update 5

Profile Sam
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Help desk expert

Joined: 9 Feb 17
Posts: 216
Credit: 7,636
RAC: 0
Message 822 - Posted: 11 Oct 2017, 2:54:41 UTC
Last modified: 11 Oct 2017, 6:08:50 UTC

Reduced logging in the assimilator logs so I can actually find things in them.

Fixed an extremely unlikely bug in the assimilator that fires if Amazon's S3 servers can't be reached.
Fixed a bug with the assimilator attempting to remove temporary folders that don't exist.
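
The assimilator's code isn't shown in this thread, so the following is only a minimal Python sketch of one way to make temporary-folder cleanup tolerate a folder that is already gone. The name `remove_temp_dir` is a hypothetical helper invented for illustration, not part of the project.

```python
import shutil
from pathlib import Path


def remove_temp_dir(path: str) -> bool:
    """Remove a temporary folder if it exists.

    Returns True if a folder was removed, False if there was nothing to remove.
    Hypothetical helper for illustration; not the project's actual code.
    """
    target = Path(path)
    if not target.is_dir():
        # Nothing to clean up, so skip instead of raising an error.
        return False
    try:
        shutil.rmtree(target)
    except FileNotFoundError:
        # The folder vanished between the check and the removal.
        return False
    return True
```

Calling the helper twice on the same path is safe: the second call simply reports that there was nothing left to delete.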

It also appears as though one of the SoFiA result parameters, Wm50, is non-deterministic. Some runs calculate it without error, but others fail. Here are the error logs from two different runs of the same parameter set:

Result 1 (courtesy of emoga):
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
Warning (Parametrization): Kinematic major axis derived from just 2 data points.
Error (Parametrization): Calculation of Wm50 failed (3).
Warning (Parametrization): Failed to measure source line width.
Error (Parametrization): Cannot fit ellipse, source flux <= 0.
Warning (Parametrization): Ellipse fit failed.
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
Result 2 (courtesy of LCB001):
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
Warning (Parametrization): Kinematic major axis derived from just 2 data points.
Error (Parametrization): Cannot fit ellipse, source flux <= 0.
Warning (Parametrization): Ellipse fit failed.
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.

Note how in the first case, the Wm50 parameter fails to calculate, but in the second there's no issue calculating it.
This is something I'll bring up with the SoFiA team, as it might be an issue somewhere that they can address.
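
To make the difference between the two logs easier to spot, here is a small Python sketch that compares them line by line. The log text is copied from the two results above; the function name `extra_lines` is just an illustrative choice, and the comparison counts repeated lines so that duplicated warnings are handled correctly.

```python
from collections import Counter

RESULT_1 = """\
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
Warning (Parametrization): Kinematic major axis derived from just 2 data points.
Error (Parametrization): Calculation of Wm50 failed (3).
Warning (Parametrization): Failed to measure source line width.
Error (Parametrization): Cannot fit ellipse, source flux <= 0.
Warning (Parametrization): Ellipse fit failed.
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
""".splitlines()

RESULT_2 = """\
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
Warning (Parametrization): Kinematic major axis derived from just 2 data points.
Error (Parametrization): Cannot fit ellipse, source flux <= 0.
Warning (Parametrization): Ellipse fit failed.
Warning (Parametrization): Cannot determine kinematic major axis. Source too faint.
Warning (Parametrization): Measurement of kinematic PA failed.
""".splitlines()


def extra_lines(a, b):
    """Return the lines that occur more often in `a` than in `b`, in `a`'s order."""
    surplus = Counter(a) - Counter(b)  # keeps only positive differences
    found = []
    for line in a:
        if surplus[line] > 0:
            found.append(line)
            surplus[line] -= 1
    return found


only_in_1 = extra_lines(RESULT_1, RESULT_2)
only_in_2 = extra_lines(RESULT_2, RESULT_1)

for line in only_in_1:
    print("only in result 1:", line)
for line in only_in_2:
    print("only in result 2:", line)
```

The only lines unique to Result 1 are the Wm50 error and the line-width warning that follows it; Result 2 has no lines of its own.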

I've also received a set of parameter files that the scientists would like us to run, so for this week we'll be running a test on those parameters to make sure everything works fine.
There are a total of 486 parameters to run per cubelet (that's 2x as many as Duchamp!), so work units should take a bit longer to complete than normal. I tested the parameters locally, and one work unit took around 30 minutes to process.
Profile Bill F

Joined: 9 Feb 17
Posts: 26
Credit: 60,629
RAC: 230
Message 825 - Posted: 12 Oct 2017, 2:47:35 UTC

When will you release the first batch of the new work units?

Thanks
Bill F
Profile Sam

Joined: 9 Feb 17
Posts: 216
Credit: 7,636
RAC: 0
Message 826 - Posted: 15 Oct 2017, 5:01:17 UTC - in response to Message 825.  

The work unit creation script seems to be having problems. I'm trying to fix it now.
Yavanius
Volunteer moderator
Joined: 12 Feb 17
Posts: 121
Credit: 163,211
RAC: 2,542
Message 827 - Posted: 15 Oct 2017, 17:44:25 UTC - in response to Message 826.  

Hey Sam,

I was just thinking, how about releasing the same WU twice to a host? One would run on Duchamp and one on SoFiA. That should give you a pretty good list to directly compare results from the same hosts, and it could also help with troubleshooting.

~Yav
LumenDan

Joined: 9 Feb 17
Posts: 88
Credit: 164,599
RAC: 610
Message 831 - Posted: 18 Oct 2017, 11:47:37 UTC - in response to Message 827.  

I have completed some batch #10 work units without any problems. The tasks run for approximately 90 minutes on my i7 computer.
Profile redtiger

Joined: 9 Feb 17
Posts: 30
Credit: 160,591
RAC: 599
Message 832 - Posted: 18 Oct 2017, 21:14:03 UTC - in response to Message 831.  

I seem to be getting a lot of tasks marked "Completed, validation inconclusive".
I have not had one validated yet.
Profile Sam

Joined: 9 Feb 17
Posts: 216
Credit: 7,636
RAC: 0
Message 834 - Posted: 19 Oct 2017, 0:27:38 UTC - in response to Message 831.  
Last modified: 19 Oct 2017, 0:29:55 UTC

90 minutes is certainly much longer than the previous WUs. Having 480-odd parameters to run significantly increases the runtime (as expected). I almost wonder if 90 minutes is too long, and whether I should split the parameter set in two.
What do you all think?
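
As a rough back-of-the-envelope check on the split, the Python sketch below uses only figures quoted in this thread (486 parameters per cubelet, about 30 minutes in the local test, about 90 minutes on a volunteer's i7). It assumes runtime scales linearly with the number of parameters, which has not been verified here, and `split_estimate` is a hypothetical helper name.

```python
PARAMETER_SETS = 486  # parameters per cubelet, from the announcement

# Runtimes reported in this thread, in minutes per work unit.
REPORTED_MINUTES = {
    "local test": 30,
    "volunteer i7": 90,
}


def split_estimate(total_minutes: float, parts: int) -> float:
    """Estimated minutes per work unit if the parameter set is split evenly.

    Assumes runtime is proportional to the number of parameters (unverified).
    """
    return total_minutes / parts


for label, minutes in REPORTED_MINUTES.items():
    seconds_per_set = minutes * 60 / PARAMETER_SETS
    halved = split_estimate(minutes, 2)
    print(
        f"{label}: {seconds_per_set:.1f} s per parameter, "
        f"about {halved:.0f} min per half ({PARAMETER_SETS // 2} parameters)"
    )
```

Under that linear assumption, the 30-minute figure works out to about 3.7 seconds per parameter and the 90-minute figure to about 11.1 seconds, so splitting in two would give work units of roughly 15 to 45 minutes each.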

EDIT: I'm also noticing a large number of validation inconclusive tasks. I'll take a look at the validator logs when I get the chance.
LumenDan

Joined: 9 Feb 17
Posts: 88
Credit: 164,599
RAC: 610
Message 837 - Posted: 19 Oct 2017, 2:47:53 UTC - in response to Message 834.  
Last modified: 19 Oct 2017, 2:48:48 UTC

I think between 1 and 6 hours of processing time is fair. You probably want to nominate a computer with minimal capabilities and set the parameter size so that this computer could finish a task in a reasonable amount of time.
Distributed computing is usually more efficient with longer run times, depending on what your priorities are. Short work units increase the overheads related to data transfer per work unit.
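
To illustrate the overhead point, here is a minimal Python sketch of how a fixed per-work-unit cost (download, upload, scheduling) takes up a larger share of short tasks. The 2-minute overhead is a made-up placeholder, not a measurement from this project, and `overhead_fraction` is a hypothetical helper name.

```python
def overhead_fraction(compute_minutes: float, overhead_minutes: float) -> float:
    """Share of the total time per work unit that is fixed overhead."""
    return overhead_minutes / (compute_minutes + overhead_minutes)


# Placeholder value for illustration only; the real overhead was not measured.
ASSUMED_OVERHEAD_MINUTES = 2.0

for compute in (15, 45, 90, 240):
    share = overhead_fraction(compute, ASSUMED_OVERHEAD_MINUTES)
    print(f"{compute:>3} min task: {share:.1%} of the time is overhead")
```

Whatever the real overhead turns out to be, its share shrinks as tasks get longer, which is the trade-off against keeping tasks short during testing.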
Profile Phil
Joined: 9 Feb 17
Posts: 15
Credit: 155,143
RAC: 1,723
Message 841 - Posted: 19 Oct 2017, 16:36:15 UTC - in response to Message 834.  

Sam wrote: "90 minutes is certainly much longer than the previous WUs. Having 480-odd parameters to run significantly increases the runtime (as expected). I almost wonder if 90 minutes is too long, and whether I should split the parameter set in two. What do you all think?"
I reckon 2-4 hour tasks are fine, but in the testing phase much shorter ones might help you.

Sam wrote: "I'm also noticing a large number of validation inconclusive tasks. I'll take a look at the validator logs when I get the chance."
It might help to cancel most of the tasks in the queue too, so that the non-validated tasks get a chance to be recalculated.
Werinbert

Joined: 30 Jul 17
Posts: 1
Credit: 10,049
RAC: 22
Message 842 - Posted: 20 Oct 2017, 11:54:28 UTC

I agree with Phil, 2-4 hours is fine for tasks...

But I would prefer that they do validate. This current batch still has too many "can't validate" or invalid results.

©2017 ICRAR