Artificial Neural Networks and Deep Learning 2023 - Homework 1 Forum

> Error while uploading: can't understand the meaning

WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
2023-11-06 17:57:18.167343: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-11-06 17:57:18.216237: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-06 17:57:18.216311: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-06 17:57:18.216354: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-06 17:57:18.227086: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-06 17:57:20.855404: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 20295 MB memory: -> device: 0, name: Quadro RTX 6000, pci bus id: 0000:88:00.0, compute capability: 7.5
Traceback (most recent call last):
File "/multiverse/storage/lattari/Prj/postdoc/Courses/AN2DL_2023/Competition1_running_dir/worker_gpu4_dir/tmp/codalab/tmpP9cuVi/run/program/score.py", line 130, in
M = model(submission_dir)
^^^^^^^^^^^^^^^^^^^^^
File "/multiverse/storage/lattari/Prj/postdoc/Courses/AN2DL_2023/Competition1_running_dir/worker_gpu4_dir/tmp/codalab/tmpP9cuVi/run/input/res/model.py", line 6, in __init__
self.model = tf.keras.models.load_model(os.path.join(path, 'SubmissionModel/BasicModel.keras'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_api.py", line 254, in load_model
return saving_lib.load_model(
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 281, in load_model
raise e
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 269, in load_model
_load_state(
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 466, in _load_state
_load_container_state(
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 534, in _load_container_state
_load_state(
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 435, in _load_state
trackable.load_own_variables(weights_store.get(inner_path))
File "/usr/local/lib/python3.11/dist-packages/keras/src/engine/base_layer.py", line 3531, in load_own_variables
raise ValueError(
ValueError: Layer 'conv1' expected 2 variables, but received 0 variables during loading. Expected: ['conv1/kernel:0', 'conv1/bias:0']

The model is saved in .keras format and it loads fine on my machine; it also loads in external notebooks and on other PCs. I really can't understand this error about the variables.

Posted by: Teo-For-Poli @ Nov. 7, 2023, 10:52 a.m.

Maybe the problem is a different version of TensorFlow? Please check the versions of the libraries we provided.
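
For example, a minimal check on your side (assuming the standard tensorflow and keras packages) is to print the versions and compare them with the ones listed for the competition environment:

import tensorflow as tf
import keras

# Compare these with the versions specified for the competition environment.
print("TensorFlow:", tf.__version__)
print("Keras:", keras.__version__)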

Posted by: an2dl.competitions @ Nov. 7, 2023, 12:35 p.m.

Unfortunately it is the same version.

Posted by: Teo-For-Poli @ Nov. 7, 2023, 12:39 p.m.

My advice, then, is to test with a small and simple model: as you can see from the leaderboard, the competition backend works correctly. It is just a matter of getting familiar with the .py file to submit.

Posted by: an2dl.competitions @ Nov. 7, 2023, 1:58 p.m.

WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
2023-11-07 14:38:20.119418: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-11-07 14:38:20.207508: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-07 14:38:20.207577: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-07 14:38:20.207639: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-07 14:38:20.233701: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-07 14:38:24.883881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 22153 MB memory: -> device: 0, name: Quadro RTX 6000, pci bus id: 0000:88:00.0, compute capability: 7.5
Traceback (most recent call last):
File "/multiverse/storage/lattari/Prj/postdoc/Courses/AN2DL_2023/Competition1_running_dir/worker_gpu4_dir/tmp/codalab/tmp1VgU_Q/run/program/score.py", line 130, in
M = model(submission_dir)
^^^^^^^^^^^^^^^^^^^^^
File "/multiverse/storage/lattari/Prj/postdoc/Courses/AN2DL_2023/Competition1_running_dir/worker_gpu4_dir/tmp/codalab/tmp1VgU_Q/run/input/res/model.py", line 11, in __init__
self.model = tfk.models.load_model(os.path.join(path, 'FFNN.keras' ))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_api.py", line 254, in load_model
return saving_lib.load_model(
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 281, in load_model
raise e
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 269, in load_model
_load_state(
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 466, in _load_state
_load_container_state(
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 534, in _load_container_state
_load_state(
File "/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py", line 435, in _load_state
trackable.load_own_variables(weights_store.get(inner_path))
File "/usr/local/lib/python3.11/dist-packages/keras/src/engine/base_layer.py", line 3531, in load_own_variables
raise ValueError(
ValueError: Layer 'dense_6' expected 2 variables, but received 0 variables during loading. Expected: ['dense_6/kernel:0', 'dense_6/bias:0']

I built a simple FFNN, but it does not work either and gives the same error.

Here is my code for model.py:

import os
import tensorflow as tf
from tensorflow import keras as tfk
from keras import layers as tfkl
import numpy as np

class model:
    seed = 33

    def __init__(self, path):
        self.model = tfk.models.load_model(os.path.join(path, 'FFNN.keras'))

    def predict(self, X):
        y_test = np.array(X)

        out = self.model.predict(y_test)
        out = np.argmax(out, axis=-1)  # Shape [BS]

        return out

Posted by: Teo-For-Poli @ Nov. 7, 2023, 2:40 p.m.

If I create the model in the script and then load the weights, it seems to solve the issue, but now this error pops up, and it is related to a script running in the backend (a sketch of the load-weights approach is after the log):

WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
2023-11-07 14:53:53.983957: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-11-07 14:53:54.231663: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-07 14:53:54.231731: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-07 14:53:54.231787: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-07 14:53:54.512655: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-07 14:53:59.879878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 22153 MB memory: -> device: 0, name: Quadro RTX 6000, pci bus id: 0000:88:00.0, compute capability: 7.5
2023-11-07 14:54:02.171616: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
Traceback (most recent call last):
File "/multiverse/storage/lattari/Prj/postdoc/Courses/AN2DL_2023/Competition1_running_dir/worker_gpu4_dir/tmp/codalab/tmpNUPFV_/run/program/score.py", line 189, in
predicted = predicted.numpy()
^^^^^^^^^^^^^^^
AttributeError: 'numpy.ndarray' object has no attribute 'numpy'. Did you mean: 'dump'?
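
In short, the workaround looks something like this (a rough sketch; the layer sizes, input shape, and weights file name are placeholders, not my actual model):

import os
from tensorflow import keras as tfk

def build_model(input_shape=(96, 96, 3), num_classes=2):  # placeholder shapes
    # Rebuild exactly the same architecture used at training time.
    inputs = tfk.Input(shape=input_shape)
    x = tfk.layers.Flatten()(inputs)
    x = tfk.layers.Dense(128, activation='relu')(x)
    outputs = tfk.layers.Dense(num_classes, activation='softmax')(x)
    return tfk.Model(inputs, outputs)

class model:
    def __init__(self, path):
        # Recreate the architecture in code, then restore only the trained weights.
        self.model = build_model()
        self.model.load_weights(os.path.join(path, 'FFNN.weights.h5'))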

Posted by: Teo-For-Poli @ Nov. 7, 2023, 2:56 p.m.

The output is expected to be a Tensor, not a NumPy array. Use tf.argmax instead.
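
Something along these lines, as a minimal sketch (keep whatever model-building code you already have in __init__; the load line below is just the one from your earlier post):

import os
import numpy as np
import tensorflow as tf
from tensorflow import keras as tfk

class model:
    def __init__(self, path):
        self.model = tfk.models.load_model(os.path.join(path, 'FFNN.keras'))

    def predict(self, X):
        out = self.model.predict(np.array(X))
        # Return a tf.Tensor so that the backend's predicted.numpy() call works.
        return tf.argmax(out, axis=-1)  # class indices, shape [BS]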

Posted by: an2dl.competitions @ Nov. 7, 2023, 3:02 p.m.